SlideShare uma empresa Scribd logo
1 de 19
Django and working with
large database tables
Django Stockholm Meetup Group
March 30, 2017
About me
Ilian Iliev
Platform Engineer at Lifesum
ilian@ilian.io
www.ilian.io
The setup
2.5GHz i7, 16GB Ram, MacBook Pro
Django 1.10
MySQL 5.7.14
PostgreSQL 9.5.4
The Models
class Tag(models.Model):
name = models.CharField(max_length=255)
class User(models.Model):
name = models.CharField(max_length=255)
date = models.DateTimeField(null=True)
class Message(models.Model):
sender = models.ForeignKey(User, related_name='sent_messages')
receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
tags = models.ManyToManyField(Tag)
The Change
class Message(models.Model):
sender = models.ForeignKey(User, related_name='sent_messages')
receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
tags = models.ManyToManyField(Tag, blank=True)
The weird migration
ALTER TABLE `big_tables_message_tags` DROP FOREIGN KEY
`big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`;
ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT
`big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id` FOREIGN KEY (`tag_id`)
REFERENCES `big_tables_tag` (`id`);
ALTER TABLE `big_tables_message_tags` DROP FOREIGN KEY
`big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`;
ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT
`big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id` FOREIGN KEY
(`message_id`) REFERENCES `big_tables_message` (`id`);
MySQL
Rows ~ 2.7M
Size ~ 88MB
message_id index size ~ 48MB
tags_id index size ~ 61MB
Migration time ~ 41 sec
The weird migration
ALTER TABLE "big_tables_message_tags" DROP CONSTRAINT
"big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id";
ALTER TABLE "big_tables_message_tags" ADD CONSTRAINT
"big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id" FOREIGN KEY ("tag_id")
REFERENCES "big_tables_tag" ("id") DEFERRABLE INITIALLY DEFERRED;
ALTER TABLE "big_tables_message_tags" DROP CONSTRAINT
"big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id";
ALTER TABLE "big_tables_message_tags" ADD CONSTRAINT
"big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id" FOREIGN KEY
("message_id") REFERENCES "big_tables_message" ("id") DEFERRABLE INITIALLY DEFERRED;
PostgreSQL
Rows ~ 2.8M
Size ~ 83MB
message_id index size ~ 77MB
tags_id index size ~ 119MB
Migration time ~ 3.2 sec
Modify the migration that created the field and add the change there
* It is a know issue https://code.djangoproject.com/ticket/25253
Solution
class MessagesTags(models.Model):
message = models.ForeignKey(Message)
tag = models.ForeignKey(Tag)
added_by = models.ForeignKey(User, null=True)
Adding fields to big tables
MySQL: 31 sec
PostgreSQL: 5.3 sec
Timing
MySQL INPLACE
ALTER TABLE `big_tables_message_tags` ADD COLUMN `added_by_id`
integer NULL, ALGORITHM INPLACE, LOCK NONE;
ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT
`big_tables_message_ta_added_by_id_88e3a4dc_fk_big_tables_user_id`
FOREIGN KEY (`added_by_id`) REFERENCES `big_tables_user` (`id`),
ALGORITHM INPLACE, LOCK NONE;
* The INPLACE algorithm is supported when foreign_key_checks is disabled.
Otherwise, only the COPY algorithm is supported.
Running this on prod
Running in on prod resulted in the API crashing
Non locking query but still too heavy for the DB
Aurora appears even slower
Alternative
class MessagesTagsExtend(models.Model):
STATUS_PENDING_REVIEW = 0
STATUS_APPROVED = 10
DEFAULT_STATUS = STATUS_PENDING_REVIEW
message_tag = models.OneToOneField(MessagesTags)
status = models.IntegerField(default=DEFAULT_STATUS)
Alternative
class MessagesTags(models.Model):
...
@property
def status(self):
try:
return self.messagestagsextend.status
except MessagesTagsExtend.DoesNotExist:
print 'here'
return MessagesTagsExtend.DEFAULT_STATUS
@status.setter
def status(self, value):
obj, _ = MessagesTagsExtend.objects.get_or_create(message_tag=self)
obj.status = value
obj.save()
self.messagestagsextend = obj
* Performance is not tested on production environment
Iterating on big tables
for x in MessagesTags.objects.all():
print x
+ Single SQL query
- Loads everything in memory
Iterating on big tables
for x in MessagesTags.objects.iterator():
print x
+ Single SQL query
+ Loads pieces of the result in memory
- prefetch_related is not working
Questions?

Mais conteúdo relacionado

Mais procurados

Psycopg2 postgres python DDL Operaytions (select , Insert , update, create ta...
Psycopg2 postgres python DDL Operaytions (select , Insert , update, create ta...Psycopg2 postgres python DDL Operaytions (select , Insert , update, create ta...
Psycopg2 postgres python DDL Operaytions (select , Insert , update, create ta...sachin kumar
 
20190627 j hipster-conf- diary of a java dev lost in the .net world
20190627   j hipster-conf- diary of a java dev lost in the .net world20190627   j hipster-conf- diary of a java dev lost in the .net world
20190627 j hipster-conf- diary of a java dev lost in the .net worldDaniel Petisme
 
XML & XPath Injections
XML & XPath InjectionsXML & XPath Injections
XML & XPath InjectionsAMol NAik
 
[3.3] Detection & exploitation of Xpath/Xquery Injections - Boris Savkov
[3.3] Detection & exploitation of Xpath/Xquery Injections - Boris Savkov[3.3] Detection & exploitation of Xpath/Xquery Injections - Boris Savkov
[3.3] Detection & exploitation of Xpath/Xquery Injections - Boris SavkovOWASP Russia
 
Indexing and Query Optimizer (Aaron Staple)
Indexing and Query Optimizer (Aaron Staple)Indexing and Query Optimizer (Aaron Staple)
Indexing and Query Optimizer (Aaron Staple)MongoSF
 
MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329Douglas Duncan
 
Indexing & query optimization
Indexing & query optimizationIndexing & query optimization
Indexing & query optimizationJared Rosoff
 
Python PCEP Functions
Python PCEP FunctionsPython PCEP Functions
Python PCEP FunctionsIHTMINSTITUTE
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)MongoDB
 
Jdbc oracle
Jdbc oracleJdbc oracle
Jdbc oracleyazidds2
 
Python dictionary : past, present, future
Python dictionary: past, present, futurePython dictionary: past, present, future
Python dictionary : past, present, futuredelimitry
 
Smarter Testing with Spock
Smarter Testing with SpockSmarter Testing with Spock
Smarter Testing with SpockDmitry Voloshko
 
1.4 data cleaning and manipulation in r and excel
1.4  data cleaning and manipulation in r and excel1.4  data cleaning and manipulation in r and excel
1.4 data cleaning and manipulation in r and excelSimple Research
 

Mais procurados (20)

MySql:Introduction
MySql:IntroductionMySql:Introduction
MySql:Introduction
 
Psycopg2 postgres python DDL Operaytions (select , Insert , update, create ta...
Psycopg2 postgres python DDL Operaytions (select , Insert , update, create ta...Psycopg2 postgres python DDL Operaytions (select , Insert , update, create ta...
Psycopg2 postgres python DDL Operaytions (select , Insert , update, create ta...
 
Mongo indexes
Mongo indexesMongo indexes
Mongo indexes
 
Hacking XPATH 2.0
Hacking XPATH 2.0Hacking XPATH 2.0
Hacking XPATH 2.0
 
20190627 j hipster-conf- diary of a java dev lost in the .net world
20190627   j hipster-conf- diary of a java dev lost in the .net world20190627   j hipster-conf- diary of a java dev lost in the .net world
20190627 j hipster-conf- diary of a java dev lost in the .net world
 
Spock
SpockSpock
Spock
 
XML & XPath Injections
XML & XPath InjectionsXML & XPath Injections
XML & XPath Injections
 
[3.3] Detection & exploitation of Xpath/Xquery Injections - Boris Savkov
[3.3] Detection & exploitation of Xpath/Xquery Injections - Boris Savkov[3.3] Detection & exploitation of Xpath/Xquery Injections - Boris Savkov
[3.3] Detection & exploitation of Xpath/Xquery Injections - Boris Savkov
 
บทที่4
บทที่4บทที่4
บทที่4
 
Indexing and Query Optimizer (Aaron Staple)
Indexing and Query Optimizer (Aaron Staple)Indexing and Query Optimizer (Aaron Staple)
Indexing and Query Optimizer (Aaron Staple)
 
MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329
 
Clojure functions midje
Clojure functions midjeClojure functions midje
Clojure functions midje
 
Indexing & query optimization
Indexing & query optimizationIndexing & query optimization
Indexing & query optimization
 
Python PCEP Functions
Python PCEP FunctionsPython PCEP Functions
Python PCEP Functions
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
Jdbc oracle
Jdbc oracleJdbc oracle
Jdbc oracle
 
Sequelize
SequelizeSequelize
Sequelize
 
Python dictionary : past, present, future
Python dictionary: past, present, futurePython dictionary: past, present, future
Python dictionary : past, present, future
 
Smarter Testing with Spock
Smarter Testing with SpockSmarter Testing with Spock
Smarter Testing with Spock
 
1.4 data cleaning and manipulation in r and excel
1.4  data cleaning and manipulation in r and excel1.4  data cleaning and manipulation in r and excel
1.4 data cleaning and manipulation in r and excel
 

Semelhante a Django and working with large database tables

Questions On The Code And Core Module
Questions On The Code And Core ModuleQuestions On The Code And Core Module
Questions On The Code And Core ModuleKatie Gulley
 
concurrency with GPars
concurrency with GParsconcurrency with GPars
concurrency with GParsPaul King
 
More Stored Procedures and MUMPS for DivConq
More Stored Procedures and  MUMPS for DivConqMore Stored Procedures and  MUMPS for DivConq
More Stored Procedures and MUMPS for DivConqeTimeline, LLC
 
Python Metaprogramming
Python MetaprogrammingPython Metaprogramming
Python MetaprogrammingSDU CYBERLAB
 
Clean code _v2003
 Clean code _v2003 Clean code _v2003
Clean code _v2003R696
 
Java → kotlin: Tests Made Simple
Java → kotlin: Tests Made SimpleJava → kotlin: Tests Made Simple
Java → kotlin: Tests Made Simpleleonsabr
 
03 object-classes-pbl-4-slots
03 object-classes-pbl-4-slots03 object-classes-pbl-4-slots
03 object-classes-pbl-4-slotsmha4
 
03 object-classes-pbl-4-slots
03 object-classes-pbl-4-slots03 object-classes-pbl-4-slots
03 object-classes-pbl-4-slotsmha4
 
[FT-7][snowmantw] How to make a new functional language and make the world be...
[FT-7][snowmantw] How to make a new functional language and make the world be...[FT-7][snowmantw] How to make a new functional language and make the world be...
[FT-7][snowmantw] How to make a new functional language and make the world be...Functional Thursday
 
Building node.js applications with Database Jones
Building node.js applications with Database JonesBuilding node.js applications with Database Jones
Building node.js applications with Database JonesJohn David Duncan
 
Addressing Scenario
Addressing ScenarioAddressing Scenario
Addressing ScenarioTara Hardin
 
GSP 125 Final Exam Guide
GSP 125 Final Exam GuideGSP 125 Final Exam Guide
GSP 125 Final Exam Guidecritter13
 
Faculty of ScienceDepartment of ComputingFinal Examinati.docx
Faculty of ScienceDepartment of ComputingFinal Examinati.docxFaculty of ScienceDepartment of ComputingFinal Examinati.docx
Faculty of ScienceDepartment of ComputingFinal Examinati.docxmydrynan
 
Metaprogramovanie #1
Metaprogramovanie #1Metaprogramovanie #1
Metaprogramovanie #1Jano Suchal
 

Semelhante a Django and working with large database tables (20)

Questions On The Code And Core Module
Questions On The Code And Core ModuleQuestions On The Code And Core Module
Questions On The Code And Core Module
 
Django Good Practices
Django Good PracticesDjango Good Practices
Django Good Practices
 
concurrency with GPars
concurrency with GParsconcurrency with GPars
concurrency with GPars
 
More Stored Procedures and MUMPS for DivConq
More Stored Procedures and  MUMPS for DivConqMore Stored Procedures and  MUMPS for DivConq
More Stored Procedures and MUMPS for DivConq
 
Python Metaprogramming
Python MetaprogrammingPython Metaprogramming
Python Metaprogramming
 
Clean code _v2003
 Clean code _v2003 Clean code _v2003
Clean code _v2003
 
Django Models
Django ModelsDjango Models
Django Models
 
Data herding
Data herdingData herding
Data herding
 
Data herding
Data herdingData herding
Data herding
 
Java → kotlin: Tests Made Simple
Java → kotlin: Tests Made SimpleJava → kotlin: Tests Made Simple
Java → kotlin: Tests Made Simple
 
Why Our Code Smells
Why Our Code SmellsWhy Our Code Smells
Why Our Code Smells
 
03 object-classes-pbl-4-slots
03 object-classes-pbl-4-slots03 object-classes-pbl-4-slots
03 object-classes-pbl-4-slots
 
03 object-classes-pbl-4-slots
03 object-classes-pbl-4-slots03 object-classes-pbl-4-slots
03 object-classes-pbl-4-slots
 
Clean code
Clean codeClean code
Clean code
 
[FT-7][snowmantw] How to make a new functional language and make the world be...
[FT-7][snowmantw] How to make a new functional language and make the world be...[FT-7][snowmantw] How to make a new functional language and make the world be...
[FT-7][snowmantw] How to make a new functional language and make the world be...
 
Building node.js applications with Database Jones
Building node.js applications with Database JonesBuilding node.js applications with Database Jones
Building node.js applications with Database Jones
 
Addressing Scenario
Addressing ScenarioAddressing Scenario
Addressing Scenario
 
GSP 125 Final Exam Guide
GSP 125 Final Exam GuideGSP 125 Final Exam Guide
GSP 125 Final Exam Guide
 
Faculty of ScienceDepartment of ComputingFinal Examinati.docx
Faculty of ScienceDepartment of ComputingFinal Examinati.docxFaculty of ScienceDepartment of ComputingFinal Examinati.docx
Faculty of ScienceDepartment of ComputingFinal Examinati.docx
 
Metaprogramovanie #1
Metaprogramovanie #1Metaprogramovanie #1
Metaprogramovanie #1
 

Último

CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 

Último (20)

CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 

Django and working with large database tables

  • 1. Django and working with large database tables Django Stockholm Meetup Group March 30, 2017
  • 2. About me Ilian Iliev Platform Engineer at Lifesum ilian@ilian.io www.ilian.io
  • 3. The setup 2.5GHz i7, 16GB Ram, MacBook Pro Django 1.10 MySQL 5.7.14 PostgreSQL 9.5.4
  • 4. The Models class Tag(models.Model): name = models.CharField(max_length=255) class User(models.Model): name = models.CharField(max_length=255) date = models.DateTimeField(null=True) class Message(models.Model): sender = models.ForeignKey(User, related_name='sent_messages') receiver = models.ForeignKey(User, related_name='recieved_messages', null=True) tags = models.ManyToManyField(Tag)
  • 5. The Change class Message(models.Model): sender = models.ForeignKey(User, related_name='sent_messages') receiver = models.ForeignKey(User, related_name='recieved_messages', null=True) tags = models.ManyToManyField(Tag, blank=True)
  • 6. The weird migration ALTER TABLE `big_tables_message_tags` DROP FOREIGN KEY `big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`; ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT `big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id` FOREIGN KEY (`tag_id`) REFERENCES `big_tables_tag` (`id`); ALTER TABLE `big_tables_message_tags` DROP FOREIGN KEY `big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`; ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT `big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id` FOREIGN KEY (`message_id`) REFERENCES `big_tables_message` (`id`);
  • 7. MySQL Rows ~ 2.7M Size ~ 88MB message_id index size ~ 48MB tags_id index size ~ 61MB Migration time ~ 41 sec
  • 8. The weird migration ALTER TABLE "big_tables_message_tags" DROP CONSTRAINT "big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id"; ALTER TABLE "big_tables_message_tags" ADD CONSTRAINT "big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id" FOREIGN KEY ("tag_id") REFERENCES "big_tables_tag" ("id") DEFERRABLE INITIALLY DEFERRED; ALTER TABLE "big_tables_message_tags" DROP CONSTRAINT "big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id"; ALTER TABLE "big_tables_message_tags" ADD CONSTRAINT "big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id" FOREIGN KEY ("message_id") REFERENCES "big_tables_message" ("id") DEFERRABLE INITIALLY DEFERRED;
  • 9. PostgreSQL Rows ~ 2.8M Size ~ 83MB message_id index size ~ 77MB tags_id index size ~ 119MB Migration time ~ 3.2 sec
  • 10. Modify the migration that created the field and add the change there * It is a know issue https://code.djangoproject.com/ticket/25253 Solution
  • 11. class MessagesTags(models.Model): message = models.ForeignKey(Message) tag = models.ForeignKey(Tag) added_by = models.ForeignKey(User, null=True) Adding fields to big tables
  • 12. MySQL: 31 sec PostgreSQL: 5.3 sec Timing
  • 13. MySQL INPLACE ALTER TABLE `big_tables_message_tags` ADD COLUMN `added_by_id` integer NULL, ALGORITHM INPLACE, LOCK NONE; ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT `big_tables_message_ta_added_by_id_88e3a4dc_fk_big_tables_user_id` FOREIGN KEY (`added_by_id`) REFERENCES `big_tables_user` (`id`), ALGORITHM INPLACE, LOCK NONE; * The INPLACE algorithm is supported when foreign_key_checks is disabled. Otherwise, only the COPY algorithm is supported.
  • 14. Running this on prod Running in on prod resulted in the API crashing Non locking query but still too heavy for the DB Aurora appears even slower
  • 15. Alternative class MessagesTagsExtend(models.Model): STATUS_PENDING_REVIEW = 0 STATUS_APPROVED = 10 DEFAULT_STATUS = STATUS_PENDING_REVIEW message_tag = models.OneToOneField(MessagesTags) status = models.IntegerField(default=DEFAULT_STATUS)
  • 16. Alternative class MessagesTags(models.Model): ... @property def status(self): try: return self.messagestagsextend.status except MessagesTagsExtend.DoesNotExist: print 'here' return MessagesTagsExtend.DEFAULT_STATUS @status.setter def status(self, value): obj, _ = MessagesTagsExtend.objects.get_or_create(message_tag=self) obj.status = value obj.save() self.messagestagsextend = obj * Performance is not tested on production environment
  • 17. Iterating on big tables for x in MessagesTags.objects.all(): print x + Single SQL query - Loads everything in memory
  • 18. Iterating on big tables for x in MessagesTags.objects.iterator(): print x + Single SQL query + Loads pieces of the result in memory - prefetch_related is not working

Notas do Editor

  1. How many of you use MySQL How many use PostgreSQL Anyone using SQLite or Oracle?
  2. And of course you will always have to add select related
  3. Single SQL query Loads everything in memory I killed it after taking 2G of ram and it still hasn’t started printing the results
  4. And of course you will always have to add select related Consider using values() and values_list()
  5. Thank you a for listening, do you have any questions.