SlideShare uma empresa Scribd logo
1 de 32
Baixar para ler offline
Our Environment Challenges and Solutions Conclusions
PostgreSQL at 20TB and Beyond
Analytics at a Massive Scale
Chris Travers
Adjust GmbH
January 25, 2018
Our Environment Challenges and Solutions Conclusions
About Adjust
Adjust is a market leader in mobile
advertisement attribution. We basically
act as a referee in pay-per-install
advertising. We focus on fairness,
fraud-prevention and other ways of
ensuring that advertisers pay for the
services they receiving fairly and with
accountability.
Our Environment Challenges and Solutions Conclusions
Our Environment
About Adjust
Traffic Statistics
Analytics Environment
Challenges and Solutions
Staffing
Throughput
Autovacuum
Data Modeling
Backups and Operations
Conclusions
Our Environment Challenges and Solutions Conclusions
Basic Facts About Us
• We are a PostgreSQL/Kafka shop
• Around 200 employees worldwide
• Link advertisements to installs
• Delivering near-real-time analytics to software vendors
• Our Data is “Big Data”
Our Environment Challenges and Solutions Conclusions
Just How Big?
• Over 100k requests per second
• Over 2 trillion data points tracked in 2017
• Over 400 TB of data to analyze
• Very high data velocity
Our Environment Challenges and Solutions Conclusions
General Architecture
• Requests come from Internet
• Written to backends
• Materialized to analytics
shards
• Shown in dashboard
Our Environment Challenges and Solutions Conclusions
Common Elements of Infrastructure
• Bare metal
• Stripped down Gentoo
• Lots of A/B performance testing
• Approx 50% more throughput than stock Linux systems
• Standard PostgreSQL + extensions
Our Environment Challenges and Solutions Conclusions
Backend Servers
• Original point of entry
• Data distributed by load balancer
• No data-dependent routing.
• Data distributed more or less randomly
• Around 20TB per backend server gets stored.
• More than 20 backend servers
Our Environment Challenges and Solutions Conclusions
Materializer
• Aggregates new events
• Copies the aggregations to the shards
• Runs every few minutes
• New data only
Our Environment Challenges and Solutions Conclusions
Materializer and MapReduce
Our materializer aggregates data from many servers and transfers
it to many servers. It functions sort of like a mapreduce with the
added complication that it is a many server to many server
transformation.
Our Environment Challenges and Solutions Conclusions
Analytics Shards
• Each about 2TB each
• 16 shards currently, may grow
• Our own custom analytics software for managing and querying
• Custom sharding/locating software
• Paired for redundancy
Our Environment Challenges and Solutions Conclusions
Staffing: Challenges
• No Junior Database Folks
• Demanding environment
• Very little room for error
• Need people who are deeply grounded in both theory and
practice
Our Environment Challenges and Solutions Conclusions
Staffing: Solutions
• Look for people with enough relevant experience they can
learn
• Be picky with what we are willing to teach new folks
• Look for self-learners with enough knowledge to participate
• Expect people to grow into the role
• We also use code challenges
Our Environment Challenges and Solutions Conclusions
Throughput challenges
• Most data is new
• It is a lot of data
• Lots of btrees with lots of random inserts
• Ideally, every wrote inserted once and updated once
• Long-term retention
Our Environment Challenges and Solutions Conclusions
Throughput Solutions
Backend Servers
• Each point of entry server has its own database
• Transactional processing is separate.
• Careful attention to alignment issues
• We write our own data types in C to help
• Tables partitioned by event time
Our Environment Challenges and Solutions Conclusions
Throughput Solutions
Analytics Shards
• Pre-aggregated data for client-facing metrics
• Sharded at roughly 2TB per shard
• 16 shards currently
• Custom sharding framework optimized to reduce network
usage
• Goal is to have dashboards load fast.
• We know where data is on these shards.
Our Environment Challenges and Solutions Conclusions
Throughput Solutions
Materializer
• Two phases
• First phase runs on original entry servers
• Aggregates and copies data to analytics shards
• Second phase runs on analytics shards
• further aggregates and copies.
Our Environment Challenges and Solutions Conclusions
Materializer: A Special Problem
• Works great when you only have one data center
• Foreign data wrapper bulk writes are very slow across data
centers
• This is a known issue with the Postgres FDW
• This is a blocking issue.
Our Environment Challenges and Solutions Conclusions
Materializer: Solution
• C extension using COPY
• Acts as libpq client
• Wrote a global transaction manager
• Throughput restored.
Our Environment Challenges and Solutions Conclusions
Introducing Autovacuum
• Queries have to provide consistent snapshots
• All updates in PostgreSQL are copy-on-write
• In our case, we write once and then update once.
• Have to clean up old data at some point
• By default, 50 rows plus 20% of table being “dead” triggers
autovacuum
Our Environment Challenges and Solutions Conclusions
Autovacuum problems
• For small tables, great but we have tables with 200M rows
• 20% of 200M rows is 40 million dead tuples....
• Autovacuum does nothing and then undertakes a heavy
task....
• performance suffers and tables bloat.
Our Environment Challenges and Solutions Conclusions
Autovacuum Solutions
• Change to 150k rows plus 0%
• Tuning requires a lot of hand-holding
• Roll out change to servers gradually to avoid overloading
system.
Our Environment Challenges and Solutions Conclusions
Why it Matters
• Under heavy load, painful to change
• Want to avoid rewriting tables
• Want to minimize disk usage
• Want to maximize alignment to pages
• Lots of little details really matter
Our Environment Challenges and Solutions Conclusions
Custom 1-byte Enums
• Country
• Language
• OS Name
• Device Type
Our Environment Challenges and Solutions Conclusions
IStore
Like HStore but for Integers
• Like HStore but for integers
• Supports addition, etc, between values of same key
• Useful for time series and other modelling problems
• Supports GIN indexing among others
Our Environment Challenges and Solutions Conclusions
The Elephant in the Room
How do we aggregate that much data?
• Basically incremental Map Reduce
• Map and first phase aggregation on backends
• Reduce and second phase aggregation on shards
• Further reduction and aggregation possible on demand
Our Environment Challenges and Solutions Conclusions
Operations Tools
• Sqitch
• Rex
• Our own custom tools
Our Environment Challenges and Solutions Conclusions
Backups
• Home grown system
• Base backup plus WAL
• Runs as a Rex task
• We can also do logical backups (but...)
Our Environment Challenges and Solutions Conclusions
Ongoing Distributed Challenges
• Major Upgrades
• Storage Space
• Multi-datacenter challenges
• Making it all fast
Our Environment Challenges and Solutions Conclusions
Overview
This environment is all about careful attention to detail and being
willing to write C code when needed. Space savings, better
alignment, and other seemingly small gains add up over tens of
billions of rows.
Our Environment Challenges and Solutions Conclusions
Major Points of Interest
• We are using PostgreSQL as a big data platform.
• We expect this architecture to scale very far.
• Provides near-real-time analytics on user actions.
Our Environment Challenges and Solutions Conclusions
PostgreSQL makes all this Possible
In buiding our 400TB analytics environment we have yet to
outgrow PostgreSQL. In fact, this is one of the few pieces of our
infrastructure we are perfectly confident in scaling.

Mais conteúdo relacionado

Mais procurados

Optimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceOptimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceMariaDB plc
 
How to find what is making your Oracle database slow
How to find what is making your Oracle database slowHow to find what is making your Oracle database slow
How to find what is making your Oracle database slowSolarWinds
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 MinutesSveta Smirnova
 
Oracle RAC Internals - The Cache Fusion Edition
Oracle RAC Internals - The Cache Fusion EditionOracle RAC Internals - The Cache Fusion Edition
Oracle RAC Internals - The Cache Fusion EditionMarkus Michalewicz
 
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.NAVER D2
 
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1SolarWinds
 
Auditing and Monitoring PostgreSQL/EPAS
Auditing and Monitoring PostgreSQL/EPASAuditing and Monitoring PostgreSQL/EPAS
Auditing and Monitoring PostgreSQL/EPASEDB
 
Building a Replicated Logging System with Apache Kafka
Building a Replicated Logging System with Apache KafkaBuilding a Replicated Logging System with Apache Kafka
Building a Replicated Logging System with Apache KafkaGuozhang Wang
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions Yugabyte
 
MMUG18 - MySQL Failover and Orchestrator
MMUG18 - MySQL Failover and OrchestratorMMUG18 - MySQL Failover and Orchestrator
MMUG18 - MySQL Failover and OrchestratorSimon J Mudd
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentJean-François Gagné
 
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Your tuning arsenal: AWR, ADDM, ASH, Metrics and AdvisorsYour tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Your tuning arsenal: AWR, ADDM, ASH, Metrics and AdvisorsJohn Kanagaraj
 
Extreme Replication - Performance Tuning Oracle GoldenGate
Extreme Replication - Performance Tuning Oracle GoldenGateExtreme Replication - Performance Tuning Oracle GoldenGate
Extreme Replication - Performance Tuning Oracle GoldenGateBobby Curtis
 
MySQL/MariaDB Proxy Software Test
MySQL/MariaDB Proxy Software TestMySQL/MariaDB Proxy Software Test
MySQL/MariaDB Proxy Software TestI Goo Lee
 
Galera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction SlidesGalera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction SlidesSeveralnines
 
Using all of the high availability options in MariaDB
Using all of the high availability options in MariaDBUsing all of the high availability options in MariaDB
Using all of the high availability options in MariaDBMariaDB plc
 
What to Expect From Oracle database 19c
What to Expect From Oracle database 19cWhat to Expect From Oracle database 19c
What to Expect From Oracle database 19cMaria Colgan
 
Oracle ZDM KamaleshRamasamy Sangam2020
Oracle ZDM KamaleshRamasamy Sangam2020Oracle ZDM KamaleshRamasamy Sangam2020
Oracle ZDM KamaleshRamasamy Sangam2020Kamalesh Ramasamy
 
Running MariaDB in multiple data centers
Running MariaDB in multiple data centersRunning MariaDB in multiple data centers
Running MariaDB in multiple data centersMariaDB plc
 

Mais procurados (20)

Optimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceOptimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performance
 
How to find what is making your Oracle database slow
How to find what is making your Oracle database slowHow to find what is making your Oracle database slow
How to find what is making your Oracle database slow
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 Minutes
 
Oracle RAC Internals - The Cache Fusion Edition
Oracle RAC Internals - The Cache Fusion EditionOracle RAC Internals - The Cache Fusion Edition
Oracle RAC Internals - The Cache Fusion Edition
 
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
 
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
 
Auditing and Monitoring PostgreSQL/EPAS
Auditing and Monitoring PostgreSQL/EPASAuditing and Monitoring PostgreSQL/EPAS
Auditing and Monitoring PostgreSQL/EPAS
 
Building a Replicated Logging System with Apache Kafka
Building a Replicated Logging System with Apache KafkaBuilding a Replicated Logging System with Apache Kafka
Building a Replicated Logging System with Apache Kafka
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 
MMUG18 - MySQL Failover and Orchestrator
MMUG18 - MySQL Failover and OrchestratorMMUG18 - MySQL Failover and Orchestrator
MMUG18 - MySQL Failover and Orchestrator
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
 
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Your tuning arsenal: AWR, ADDM, ASH, Metrics and AdvisorsYour tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
 
Oracle RAC 12c Overview
Oracle RAC 12c OverviewOracle RAC 12c Overview
Oracle RAC 12c Overview
 
Extreme Replication - Performance Tuning Oracle GoldenGate
Extreme Replication - Performance Tuning Oracle GoldenGateExtreme Replication - Performance Tuning Oracle GoldenGate
Extreme Replication - Performance Tuning Oracle GoldenGate
 
MySQL/MariaDB Proxy Software Test
MySQL/MariaDB Proxy Software TestMySQL/MariaDB Proxy Software Test
MySQL/MariaDB Proxy Software Test
 
Galera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction SlidesGalera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction Slides
 
Using all of the high availability options in MariaDB
Using all of the high availability options in MariaDBUsing all of the high availability options in MariaDB
Using all of the high availability options in MariaDB
 
What to Expect From Oracle database 19c
What to Expect From Oracle database 19cWhat to Expect From Oracle database 19c
What to Expect From Oracle database 19c
 
Oracle ZDM KamaleshRamasamy Sangam2020
Oracle ZDM KamaleshRamasamy Sangam2020Oracle ZDM KamaleshRamasamy Sangam2020
Oracle ZDM KamaleshRamasamy Sangam2020
 
Running MariaDB in multiple data centers
Running MariaDB in multiple data centersRunning MariaDB in multiple data centers
Running MariaDB in multiple data centers
 

Semelhante a PostgreSQL at 20TB and Beyond

PostgreSQL as a Big Data Platform
PostgreSQL as a Big Data Platform PostgreSQL as a Big Data Platform
PostgreSQL as a Big Data Platform Chris Travers
 
Why retail companies can't afford database downtime
Why retail companies can't afford database downtimeWhy retail companies can't afford database downtime
Why retail companies can't afford database downtimeDBmaestro - Database DevOps
 
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityLarge Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityRandy Shoup
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, whenEugenio Minardi
 
DNN-Connect 2019: DNN Horror Stories
DNN-Connect 2019: DNN Horror StoriesDNN-Connect 2019: DNN Horror Stories
DNN-Connect 2019: DNN Horror StoriesWill Strohl
 
PostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolPostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolEDB
 
Scaling apps for the big time
Scaling apps for the big timeScaling apps for the big time
Scaling apps for the big timeproitconsult
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Databricks
 
CD presentation march 12th, 2018
CD presentation march 12th, 2018CD presentation march 12th, 2018
CD presentation march 12th, 2018Ran Levy
 
Mixing d ps building architecture on the cross cutting example
Mixing d ps building architecture on the cross cutting exampleMixing d ps building architecture on the cross cutting example
Mixing d ps building architecture on the cross cutting examplecorehard_by
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoopch adnan
 
Scaling Systems: Architectures that grow
Scaling Systems: Architectures that growScaling Systems: Architectures that grow
Scaling Systems: Architectures that growGibraltar Software
 
The challenges of live events scalability
The challenges of live events scalabilityThe challenges of live events scalability
The challenges of live events scalabilityGuy Tomer
 
AOEcon17: Searchperience - The journey from PHP and Solr to Scala and Elastic...
AOEcon17: Searchperience - The journey from PHP and Solr to Scala and Elastic...AOEcon17: Searchperience - The journey from PHP and Solr to Scala and Elastic...
AOEcon17: Searchperience - The journey from PHP and Solr to Scala and Elastic...AOE
 
DBmaestro's State of the Database Continuous Delivery Survey- Findings Revealed
DBmaestro's State of the Database Continuous Delivery Survey- Findings RevealedDBmaestro's State of the Database Continuous Delivery Survey- Findings Revealed
DBmaestro's State of the Database Continuous Delivery Survey- Findings RevealedDBmaestro - Database DevOps
 

Semelhante a PostgreSQL at 20TB and Beyond (20)

PostgreSQL as a Big Data Platform
PostgreSQL as a Big Data Platform PostgreSQL as a Big Data Platform
PostgreSQL as a Big Data Platform
 
No stress with state
No stress with stateNo stress with state
No stress with state
 
Why retail companies can't afford database downtime
Why retail companies can't afford database downtimeWhy retail companies can't afford database downtime
Why retail companies can't afford database downtime
 
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityLarge Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, when
 
DNN-Connect 2019: DNN Horror Stories
DNN-Connect 2019: DNN Horror StoriesDNN-Connect 2019: DNN Horror Stories
DNN-Connect 2019: DNN Horror Stories
 
PostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolPostgreSQL as a Strategic Tool
PostgreSQL as a Strategic Tool
 
A data analyst view of Bigdata
A data analyst view of Bigdata A data analyst view of Bigdata
A data analyst view of Bigdata
 
Scaling apps for the big time
Scaling apps for the big timeScaling apps for the big time
Scaling apps for the big time
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
 
CD presentation march 12th, 2018
CD presentation march 12th, 2018CD presentation march 12th, 2018
CD presentation march 12th, 2018
 
Voldemort Nosql
Voldemort NosqlVoldemort Nosql
Voldemort Nosql
 
Mixing d ps building architecture on the cross cutting example
Mixing d ps building architecture on the cross cutting exampleMixing d ps building architecture on the cross cutting example
Mixing d ps building architecture on the cross cutting example
 
Intro to Big Data
Intro to Big DataIntro to Big Data
Intro to Big Data
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
 
Scaling Systems: Architectures that grow
Scaling Systems: Architectures that growScaling Systems: Architectures that grow
Scaling Systems: Architectures that grow
 
In (database) automation we trust
In (database) automation we trustIn (database) automation we trust
In (database) automation we trust
 
The challenges of live events scalability
The challenges of live events scalabilityThe challenges of live events scalability
The challenges of live events scalability
 
AOEcon17: Searchperience - The journey from PHP and Solr to Scala and Elastic...
AOEcon17: Searchperience - The journey from PHP and Solr to Scala and Elastic...AOEcon17: Searchperience - The journey from PHP and Solr to Scala and Elastic...
AOEcon17: Searchperience - The journey from PHP and Solr to Scala and Elastic...
 
DBmaestro's State of the Database Continuous Delivery Survey- Findings Revealed
DBmaestro's State of the Database Continuous Delivery Survey- Findings RevealedDBmaestro's State of the Database Continuous Delivery Survey- Findings Revealed
DBmaestro's State of the Database Continuous Delivery Survey- Findings Revealed
 

Último

Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Sumanth A
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxRomil Mishra
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...Erbil Polytechnic University
 
List of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfList of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfisabel213075
 
priority interrupt computer organization
priority interrupt computer organizationpriority interrupt computer organization
priority interrupt computer organizationchnrketan
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdfHafizMudaserAhmad
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSsandhya757531
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONjhunlian
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier Fernández Muñoz
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewsandhya757531
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodManicka Mamallan Andavar
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfBalamuruganV28
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxStephen Sitton
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Communityprachaibot
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating SystemRashmi Bhat
 
Forming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptForming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptNoman khan
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmDeepika Walanjkar
 

Último (20)

Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptx
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...
 
List of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfList of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdf
 
priority interrupt computer organization
priority interrupt computer organizationpriority interrupt computer organization
priority interrupt computer organization
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptx
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overview
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument method
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdf
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptx
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Community
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating System
 
Forming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptForming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).ppt
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
 

PostgreSQL at 20TB and Beyond

  • 1. Our Environment Challenges and Solutions Conclusions PostgreSQL at 20TB and Beyond Analytics at a Massive Scale Chris Travers Adjust GmbH January 25, 2018
  • 2. Our Environment Challenges and Solutions Conclusions About Adjust Adjust is a market leader in mobile advertisement attribution. We basically act as a referee in pay-per-install advertising. We focus on fairness, fraud-prevention and other ways of ensuring that advertisers pay for the services they receiving fairly and with accountability.
  • 3. Our Environment Challenges and Solutions Conclusions Our Environment About Adjust Traffic Statistics Analytics Environment Challenges and Solutions Staffing Throughput Autovacuum Data Modeling Backups and Operations Conclusions
  • 4. Our Environment Challenges and Solutions Conclusions Basic Facts About Us • We are a PostgreSQL/Kafka shop • Around 200 employees worldwide • Link advertisements to installs • Delivering near-real-time analytics to software vendors • Our Data is “Big Data”
  • 5. Our Environment Challenges and Solutions Conclusions Just How Big? • Over 100k requests per second • Over 2 trillion data points tracked in 2017 • Over 400 TB of data to analyze • Very high data velocity
  • 6. Our Environment Challenges and Solutions Conclusions General Architecture • Requests come from Internet • Written to backends • Materialized to analytics shards • Shown in dashboard
  • 7. Our Environment Challenges and Solutions Conclusions Common Elements of Infrastructure • Bare metal • Stripped down Gentoo • Lots of A/B performance testing • Approx 50% more throughput than stock Linux systems • Standard PostgreSQL + extensions
  • 8. Our Environment Challenges and Solutions Conclusions Backend Servers • Original point of entry • Data distributed by load balancer • No data-dependent routing. • Data distributed more or less randomly • Around 20TB per backend server gets stored. • More than 20 backend servers
  • 9. Our Environment Challenges and Solutions Conclusions Materializer • Aggregates new events • Copies the aggregations to the shards • Runs every few minutes • New data only
  • 10. Our Environment Challenges and Solutions Conclusions Materializer and MapReduce Our materializer aggregates data from many servers and transfers it to many servers. It functions sort of like a mapreduce with the added complication that it is a many server to many server transformation.
  • 11. Our Environment Challenges and Solutions Conclusions Analytics Shards • Each about 2TB each • 16 shards currently, may grow • Our own custom analytics software for managing and querying • Custom sharding/locating software • Paired for redundancy
  • 12. Our Environment Challenges and Solutions Conclusions Staffing: Challenges • No Junior Database Folks • Demanding environment • Very little room for error • Need people who are deeply grounded in both theory and practice
  • 13. Our Environment Challenges and Solutions Conclusions Staffing: Solutions • Look for people with enough relevant experience they can learn • Be picky with what we are willing to teach new folks • Look for self-learners with enough knowledge to participate • Expect people to grow into the role • We also use code challenges
  • 14. Our Environment Challenges and Solutions Conclusions Throughput challenges • Most data is new • It is a lot of data • Lots of btrees with lots of random inserts • Ideally, every wrote inserted once and updated once • Long-term retention
  • 15. Our Environment Challenges and Solutions Conclusions Throughput Solutions Backend Servers • Each point of entry server has its own database • Transactional processing is separate. • Careful attention to alignment issues • We write our own data types in C to help • Tables partitioned by event time
  • 16. Our Environment Challenges and Solutions Conclusions Throughput Solutions Analytics Shards • Pre-aggregated data for client-facing metrics • Sharded at roughly 2TB per shard • 16 shards currently • Custom sharding framework optimized to reduce network usage • Goal is to have dashboards load fast. • We know where data is on these shards.
  • 17. Our Environment Challenges and Solutions Conclusions Throughput Solutions Materializer • Two phases • First phase runs on original entry servers • Aggregates and copies data to analytics shards • Second phase runs on analytics shards • further aggregates and copies.
  • 18. Our Environment Challenges and Solutions Conclusions Materializer: A Special Problem • Works great when you only have one data center • Foreign data wrapper bulk writes are very slow across data centers • This is a known issue with the Postgres FDW • This is a blocking issue.
  • 19. Our Environment Challenges and Solutions Conclusions Materializer: Solution • C extension using COPY • Acts as libpq client • Wrote a global transaction manager • Throughput restored.
  • 20. Our Environment Challenges and Solutions Conclusions Introducing Autovacuum • Queries have to provide consistent snapshots • All updates in PostgreSQL are copy-on-write • In our case, we write once and then update once. • Have to clean up old data at some point • By default, 50 rows plus 20% of table being “dead” triggers autovacuum
  • 21. Our Environment Challenges and Solutions Conclusions Autovacuum problems • For small tables, great but we have tables with 200M rows • 20% of 200M rows is 40 million dead tuples.... • Autovacuum does nothing and then undertakes a heavy task.... • performance suffers and tables bloat.
  • 22. Our Environment Challenges and Solutions Conclusions Autovacuum Solutions • Change to 150k rows plus 0% • Tuning requires a lot of hand-holding • Roll out change to servers gradually to avoid overloading system.
  • 23. Our Environment Challenges and Solutions Conclusions Why it Matters • Under heavy load, painful to change • Want to avoid rewriting tables • Want to minimize disk usage • Want to maximize alignment to pages • Lots of little details really matter
  • 24. Our Environment Challenges and Solutions Conclusions Custom 1-byte Enums • Country • Language • OS Name • Device Type
  • 25. Our Environment Challenges and Solutions Conclusions IStore Like HStore but for Integers • Like HStore but for integers • Supports addition, etc, between values of same key • Useful for time series and other modelling problems • Supports GIN indexing among others
  • 26. Our Environment Challenges and Solutions Conclusions The Elephant in the Room How do we aggregate that much data? • Basically incremental Map Reduce • Map and first phase aggregation on backends • Reduce and second phase aggregation on shards • Further reduction and aggregation possible on demand
  • 27. Our Environment Challenges and Solutions Conclusions Operations Tools • Sqitch • Rex • Our own custom tools
  • 28. Our Environment Challenges and Solutions Conclusions Backups • Home grown system • Base backup plus WAL • Runs as a Rex task • We can also do logical backups (but...)
  • 29. Our Environment Challenges and Solutions Conclusions Ongoing Distributed Challenges • Major Upgrades • Storage Space • Multi-datacenter challenges • Making it all fast
  • 30. Our Environment Challenges and Solutions Conclusions Overview This environment is all about careful attention to detail and being willing to write C code when needed. Space savings, better alignment, and other seemingly small gains add up over tens of billions of rows.
  • 31. Our Environment Challenges and Solutions Conclusions Major Points of Interest • We are using PostgreSQL as a big data platform. • We expect this architecture to scale very far. • Provides near-real-time analytics on user actions.
  • 32. Our Environment Challenges and Solutions Conclusions PostgreSQL makes all this Possible In buiding our 400TB analytics environment we have yet to outgrow PostgreSQL. In fact, this is one of the few pieces of our infrastructure we are perfectly confident in scaling.