SlideShare uma empresa Scribd logo
1 de 47
Baixar para ler offline
How we replaced 10 years of legacy 	

with Django & AWS
Contents
• Background stuff -- who are we and why
are we doing what we do?	

• Technology bits -- what are we doing?	

• Processes -- how are we doing it?
Background stuff
Technology bits
Processes
Who are we?
• Kaleva Oy	

• Founded in 1899	

• 520 employees	

• Kaleva -- the printed newspaper	

• 192k readers every day	

• 4th largest 7-days/week
newspaper in Finland	

• commercial print
Okay, but who are we?
• Kaleva Digital Business 	

• separated as its own unit a couple of years ago	

• 4 developers, 1 graphic designer	

• bunch of sales people and managers	

• “startup” within the company	

• kaleva.fi	

• ~190k unique visitors/week	

• ~10M page views per month	

• 75% of users visit at least once a day	

• 85% of uses come from Oulu Province
*
Not just kaleva.fi
• our unit has huge growth goals	

• not possible with just the news portal	

• completely new services in the works	

• openness (data, services, code, ideas)	

• prototyping
Replacing legacy
kaleva.fi
What did we replace?
• 10 years of legacy ColdFusion code 	

• Development practices had remained almost the same the whole
time	

• mostly everything was done ad hoc	

• no version control	

• no automated testing	

• no test environments and/or DBs --development directly in
production	

• no documentation	

• A lot of much silent information concentrated on a few people
Kaleva.fi - rich platform
• news, mobile site, blogs, reviews, comments, events
calendar, discussion forums, reader photos, videos, photo
galleries, cartoons, mini sites, polls, competitions, weather...	

• sister publications: Kiekko-Kaleva, Hightech Forum	

• loads of customized tools for journalists and other
content providers	

• plethora of integrations: content feeds from print
newspaper, Kaleva’s photo journalists, STT-Lehtikuva,
European Photo Agency, SMS/MMS gateways plus
numerous other partners
What did we want?
• Agility, reusability and scalability	

• Processes	

• Tools	

• Infrastructure	

• Code	

• People	

• From “idea land” to working prototype in one spike	

• Metrics vs. opinions
Choosing the core
• existing CMS or build our own?	

• independence from vendor-lock in	

• not having to change the way journalists do their work	

• we’re not just building a news portal	

• new services that have very little in common with Kaleva.fi	

• reusability, common practices (development, packaging,
deployment, ops...)	

• how not to build every feature from scratch?	

• how to rewrite what we need fast enough and with good quality?	

• avoiding feature creep -- how not to end up with a neverending
“catch me if you can” with the legacy service?
Django to the rescue!
• Has CMS-like functionality built-in that can we could use	

• Design principles guide towards a consistent SW
architecture	

• Fast to develop	

• Proven & scalable	

• Chicago Tribune,The Onion, NewYork Times...	

• Disqus, Instagram, Pinterest, Google	

• Strong community	

• Python <3
Kaleva.fi v2
- what did we build?
Technology bits
Elastic Load Balancer
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (master)
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (slave)
Availability zone Availability zone
Streaming replication
S3
WAL archives
Periodic full backups
Offsite backups
S3 (static files)
Elastic Load Balancer
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (master)
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (slave)
Availability zone Availability zone
Streaming replication
S3
WAL archives
Periodic full backups
Offsite backups
S3 (static files)
Elastic Load Balancer
• Balances load toVarnish proxies in different AZs	

• One virtual instance, scales automatically depending on the load	

• Capable of SSL offloading	

• There’s a catch: 	

• you cannot point to an IP so A records are out of the picture	

• If you want naked domains (e.g. domain.com), you have some
choices:	

• Route53	

• Redirects
Elastic Load Balancer
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (master)
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (slave)
Availability zone Availability zone
Streaming replication
S3
WAL archives
Periodic full backups
Offsite backups
S3 (static files)
Varnish
• A must-have for read-heavy sites	

• Most challenges in news portals are related the giant
user peaks when something worth noticing has
happened	

• shootouts, natural disasters etc.	

• Ideal forVarnish: few unique pages, loads of users	

• Even in normal day, most users hit the front page
and a reasonable number of “hottest news” of the
moment
Varnish
• Average cache hit rate ~80%	

• CPU load insignificant	

• Works very well with default configurations	

• Do not store your cache files in EBS, instead use
RAM :)	

• Caching non-personalized content is easy	

• E.g. cookies (think of CSRF tokens) may need some
extra work	

• While caching is easy, invalidation is harder
Varnish
• Currently, in most scenarios we use a somewhat naive
TTL-based cache strategy	

• Origin servers return max_age headers, which
Varnish obeys	

• Can be overriden byVarnish	

• Key is to also use the built-in cache infrastructure in
internet: allow any intermediate proxies and the
User Agent to cache the content	

• Grace period allowsVarnish to serve stale content
when fetching updated versions from backends
Varnish
• Caching and static files	

• version x of site:

<link rel="stylesheet" type="text/css" href="kalevafi-min.css" />
• Response headers:

[...]

Cache-Control: max-age=2592000

[...]	

• Changes to the CSS will be visible to users after 30 days	

• File versioning to the rescue!	

• On deployment, append MD5 sum to the names of static files: 

<link rel="stylesheet" type="text/css" href="kalevafi-min--beac47.css" />
• Nginx ignores the MD5 sum	

• Checksum only changes when the static file has changed	

• Hide the complexity to e.g. a custom templatetag	

• We’re still using Django 1.3.x, might be able to move from custom
implementation to Django’s static file versioning in 1.4
Elastic Load Balancer
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (master)
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (slave)
Availability zone Availability zone
Streaming replication
S3
WAL archives
Periodic full backups
Offsite backups
S3 (static files)
nginx + Django
• nginx -> FastCGI -> Django	

• Simple throwaway work horses	

• Easy to scale (Autoscaling, yay!)	

• CPU-heavy	

• Currently “fast enough”	

• intuition says should be faster, we’ve got profiling to do	

• we want to see how uWSGI, Gunicorn or perhaps Jinja2 would perform
under live load	

• easy because new instances can be launched within minutes	

• Nginx handles compression of content	

• gzip_vary makes it work withVarnish (Vary: Accept-Encoding)
Elastic Load Balancer
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (master)
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (slave)
Availability zone Availability zone
Streaming replication
S3
WAL archives
Periodic full backups
Offsite backups
S3 (static files)
Memcached
• Each availability zone has two Memcached nodes (failure/memory corruption
within AZ only affects 1/2 of the keyspace local to the zone)	

• We’re discussing about moving to Amazon ElastiCache (protocol-
compatible with Memcached)	

• Precalculated things (e.g. counts, which are very slow with PostgreSQL)	

• Generally do whatever you can to reduce amount of queries hitting the
database	

• Again, we mostly use a naive TTL-based eviction policy	

• not trivial to co-ordinate TTLs betweenVarnish and Memcached optimally	

• The way to handle peak traffic if requests do not hitVarnish cache	

• Effective for contents that are common to several different pages
Elastic Load Balancer
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (master)
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (slave)
Availability zone Availability zone
Streaming replication
S3
WAL archives
Periodic full backups
Offsite backups
S3 (static files)
RabbitMQ
• Celery + RabbitMQ for asynchronous processing	

• For example, we use it for automatically scaling user uploaded
images	

• file is uploaded and the resize job is put to the queue	

• once we have a spare worker, the job is carried out	

• Scales well, integrates nicely with Django	

• Can be used as a buffer e.g. against latency spikes in DB	

• perform DB writes in batches -> peak commit frequency
drops to a fraction	

• your DB doesn’t have to scale 1:1 for write ops from users
Elastic Load Balancer
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (master)
Varnish proxy
nginx +
FastCGI/Django
Memcached RabbitMQ
PostgreSQL (slave)
Availability zone Availability zone
Streaming replication
S3
WAL archives
Periodic full backups
Offsite backups
S3 (static files)
PostgreSQL
• Reliable, proper RDBMS for core features	

• Open source, not depending on corporate strategies	

• PostGIS+GeoDjango for location-aware features	

• NoSQL could be beneficial in certain write-heavy
components	

• in our scale, PostgreSQL (+queues) still does its job
well enough	

• more technology == more things that can break ==
more time spent doing ops stuff instead of
developing
PostgreSQL and AWS
• Very challenging to build reliable and reasonably performing DB in AWS	

• EBS is slow	

• not just slow but performance varies a lot	

• shares NIC with your instance...	

• get the largest instance type you can afford	

• assume max. 100 iops per volume	

• Our solution: RAID 10 (8 volumes mirrored and striped + 1 spare) + trying to fit as much
in RAM as possible	

• Everything can fail -- instances, EBS platform...	

• usually local to an availability zone	

• multifails have happened (EBS fails in one AZ -> cannot launch new instances to other AZs)	

• Master-slave configuration a must	

• Streaming replication between multiple availability zones	

• WAL archiving to S3
Amazon Web Services
• Initially seems easy	

• Soon, you’ll notice it’s actually very hard	

• With elasticity comes the need to take control of the
ever-changing configuration	

• Amazon provides building blocks for your
infrastructure, how you manage it is up to you	

• Solve this as soon as you can	

• Learn from others mistakes	

• Eventually, (most) things become easy again
How are we doing it?
Processes
DevOps culture
• How to make zero downtime deployments?	

• During peak hours?	

• Think how your features can be deployed when
you design/develop them	

• Feel the pain of having to fix production problems,
you do your best to try to prevent problems	

• Fear of a phone call in the middle of the night is
an effective motivator :)
DevOps culture
• It grows from simple things	

• We have a rolling “Janitor” role -- each
developer gets to be the Janitor for one
sprint at a time	

• primary person to handle deployments,
monitoring etc.	

• gives time for the rest of the team to focus
on their backlog items
Infrastructure as a code
• The best way to document how your servers can be rebuilt is to have it in
your version control	

• Server configurations are part of your application and they do change in time	

• We tried to use Fabric for managing configuration	

• didn’t believe when everyone on the web said it’s a mistake	

• Fabric is a great for SSH automation but not for configuration
management	

• We’ve started replacing the Fabric-managed nodes with Chef +
CloudFormation managed nodes	

• Painful to get started, but gets better	

• Amazon provides a sample CloudFormation template to setup Chef
server + infra for validation.pem provisioning	

• Promising push-based newcomer: http://ansible.github.com/
Continuous Delivery
• All changes in version control gets deployed in CI	

• Automated acceptance tests in CI	

• If green, it’s a potential QA release	

• User Acceptance Tests in QA	

• Automate tests especially for things that 	

• change a lot	

• break a lot	

• are difficult/laborious to test manually	

• Code coverage is not a particularly good way to measure
automated test quality
Deployments
• Goal: low-risk, zero downtime releases during office hours	

• Decouple deployments from launching new features	

• Use feature toggles to enable/disable features	

• For Django, Gargoyle by Disqus does the trick very well	

• Canary releases -- deploy only to a certain group of users (e.g. % of
user base, users coming from a certain IP block etc) and monitor
results	

• monitor also also business metrics: did the feature have the
effect we wanted?	

• if a feature appears to cause issues, disable it and have it fixed
Dark launches
!
• Add a new feature, e.g. a widget on front page but do
not yet show it to users (just make it do whatever it is
supposed to do)	

• Measure impact (e.g. performance) without the users
noticing something might be wrong	

• Again, enable first for certain amount of users and if
you notice problems, disable the feature	

• Once you’ve made sure everything works smoothly,
unhide the widget
The tricky parts?
Packaging Django
• Deploying Django is a pain	

• Pip + virtualenv is great in theory	

• In practice, there are too many issues:	

• Pypi down or slow (you’ll need your own Pypi
mirror)	

• 10 minutes/server for building virtualenv	

• Each server needs to have C compiler + bunch of -
devel packages -- stuff that doesn’t belong to
production machines!
Packaging Django
• We prepare a RPM package on CI build server	

• Includes virtualenv (not yet using --relocatable due
to issues with it)	

• RPM packaging easy to automate with fpm (https://
github.com/jordansissel/fpm)	

• The RPM package is installed in all environments from an
internalYum repository	

• Easy to manage -- just like any other application	

• Once installed, update symlinks & restart FastCGI and we’re
live
Packaging Django
• The whole deployment process is currently
automated with Fabric	

• Once command can do everything:	

• rolling update, one availability zone at a time	

• attaches/detaches nodes from Elastic Load Balancer	

• automatically handle scheduled downtime in Nagios	

• The whole stack is updated in a few minutes
Data model changes
• Simple solution: avoid them	

• If not possible to avoid, use patterns such as expand/contract:	

• Add a new (NULLable) field	

• Lazy migration: on user request try to read from new field,
if data is not present, fill it	

• Once almost everything is migrated, run a batch update on
rest	

• Finally, enforce constraints (NOT NULL etc)	

• Backwards compatibility critical
Data model changes
• Decoupling database changes and SW difficult with ORM +
South	

• Locking issues	

• PostgreSQL requires exclusive locks when it creates foreign
keys	

• Example: we use Generic Foreign Keys and
django_content_type extensively for linking content
to other content	

• Adding new fields with FK to django_content_type
are not possible if there are long-running transactions having
any locks on the django_content_type_id
Django ORM
• Sometimes you may need to use raw SQL to trivialize your problems	

• Trees are traditionally nasty for relational DBs	

• news sections are a tree structure and we often need to fetch data
from current section and all its children and parent recursively	

• Pretty much the bread and butter kind of query type that occurs in
several parts of the site for various types of content	

• We did it first with Django ORM	

• In worst cases, it took several seconds to execute per query	

• Replaced it with raw SQL that uses PostgreSQL’s WITH
RECURSIVE and the query does its job in milliseconds	

• It’s great to know you always have the choice of raw SQL
Summary
• Django was critical for our success and we
strongly believe it will enable us to do great
things also in the future	

• Deploying Django is hard	

• Doing things right in the cloud is hard	

• Not saying we’re doing everything right, but
we’re going there	

• But luckily we like challenges!
Questions?
Thank you
@raveli	

markus.syrjanen@kaleva.fi

Mais conteúdo relacionado

Último

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 

Último (20)

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 

Destaque

Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 

Destaque (20)

Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 

Kaleva.fi: How we replaced 10 years of Legacy with Django and AWS

  • 1. How we replaced 10 years of legacy with Django & AWS
  • 2. Contents • Background stuff -- who are we and why are we doing what we do? • Technology bits -- what are we doing? • Processes -- how are we doing it? Background stuff Technology bits Processes
  • 3. Who are we? • Kaleva Oy • Founded in 1899 • 520 employees • Kaleva -- the printed newspaper • 192k readers every day • 4th largest 7-days/week newspaper in Finland • commercial print
  • 4. Okay, but who are we? • Kaleva Digital Business • separated as its own unit a couple of years ago • 4 developers, 1 graphic designer • bunch of sales people and managers • “startup” within the company • kaleva.fi • ~190k unique visitors/week • ~10M page views per month • 75% of users visit at least once a day • 85% of uses come from Oulu Province *
  • 5. Not just kaleva.fi • our unit has huge growth goals • not possible with just the news portal • completely new services in the works • openness (data, services, code, ideas) • prototyping
  • 7. What did we replace? • 10 years of legacy ColdFusion code • Development practices had remained almost the same the whole time • mostly everything was done ad hoc • no version control • no automated testing • no test environments and/or DBs --development directly in production • no documentation • A lot of much silent information concentrated on a few people
  • 8. Kaleva.fi - rich platform • news, mobile site, blogs, reviews, comments, events calendar, discussion forums, reader photos, videos, photo galleries, cartoons, mini sites, polls, competitions, weather... • sister publications: Kiekko-Kaleva, Hightech Forum • loads of customized tools for journalists and other content providers • plethora of integrations: content feeds from print newspaper, Kaleva’s photo journalists, STT-Lehtikuva, European Photo Agency, SMS/MMS gateways plus numerous other partners
  • 9. What did we want? • Agility, reusability and scalability • Processes • Tools • Infrastructure • Code • People • From “idea land” to working prototype in one spike • Metrics vs. opinions
  • 10. Choosing the core • existing CMS or build our own? • independence from vendor-lock in • not having to change the way journalists do their work • we’re not just building a news portal • new services that have very little in common with Kaleva.fi • reusability, common practices (development, packaging, deployment, ops...) • how not to build every feature from scratch? • how to rewrite what we need fast enough and with good quality? • avoiding feature creep -- how not to end up with a neverending “catch me if you can” with the legacy service?
  • 11. Django to the rescue! • Has CMS-like functionality built-in that can we could use • Design principles guide towards a consistent SW architecture • Fast to develop • Proven & scalable • Chicago Tribune,The Onion, NewYork Times... • Disqus, Instagram, Pinterest, Google • Strong community • Python <3
  • 12. Kaleva.fi v2 - what did we build? Technology bits
  • 13. Elastic Load Balancer Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (master) Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (slave) Availability zone Availability zone Streaming replication S3 WAL archives Periodic full backups Offsite backups S3 (static files)
  • 14. Elastic Load Balancer Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (master) Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (slave) Availability zone Availability zone Streaming replication S3 WAL archives Periodic full backups Offsite backups S3 (static files)
  • 15. Elastic Load Balancer • Balances load toVarnish proxies in different AZs • One virtual instance, scales automatically depending on the load • Capable of SSL offloading • There’s a catch: • you cannot point to an IP so A records are out of the picture • If you want naked domains (e.g. domain.com), you have some choices: • Route53 • Redirects
  • 16. Elastic Load Balancer Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (master) Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (slave) Availability zone Availability zone Streaming replication S3 WAL archives Periodic full backups Offsite backups S3 (static files)
  • 17. Varnish • A must-have for read-heavy sites • Most challenges in news portals are related the giant user peaks when something worth noticing has happened • shootouts, natural disasters etc. • Ideal forVarnish: few unique pages, loads of users • Even in normal day, most users hit the front page and a reasonable number of “hottest news” of the moment
  • 18. Varnish • Average cache hit rate ~80% • CPU load insignificant • Works very well with default configurations • Do not store your cache files in EBS, instead use RAM :) • Caching non-personalized content is easy • E.g. cookies (think of CSRF tokens) may need some extra work • While caching is easy, invalidation is harder
  • 19. Varnish • Currently, in most scenarios we use a somewhat naive TTL-based cache strategy • Origin servers return max_age headers, which Varnish obeys • Can be overriden byVarnish • Key is to also use the built-in cache infrastructure in internet: allow any intermediate proxies and the User Agent to cache the content • Grace period allowsVarnish to serve stale content when fetching updated versions from backends
  • 20. Varnish • Caching and static files • version x of site:
 <link rel="stylesheet" type="text/css" href="kalevafi-min.css" /> • Response headers:
 [...]
 Cache-Control: max-age=2592000
 [...] • Changes to the CSS will be visible to users after 30 days • File versioning to the rescue! • On deployment, append MD5 sum to the names of static files: 
 <link rel="stylesheet" type="text/css" href="kalevafi-min--beac47.css" /> • Nginx ignores the MD5 sum • Checksum only changes when the static file has changed • Hide the complexity to e.g. a custom templatetag • We’re still using Django 1.3.x, might be able to move from custom implementation to Django’s static file versioning in 1.4
  • 21. Elastic Load Balancer Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (master) Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (slave) Availability zone Availability zone Streaming replication S3 WAL archives Periodic full backups Offsite backups S3 (static files)
  • 22. nginx + Django • nginx -> FastCGI -> Django • Simple throwaway work horses • Easy to scale (Autoscaling, yay!) • CPU-heavy • Currently “fast enough” • intuition says should be faster, we’ve got profiling to do • we want to see how uWSGI, Gunicorn or perhaps Jinja2 would perform under live load • easy because new instances can be launched within minutes • Nginx handles compression of content • gzip_vary makes it work withVarnish (Vary: Accept-Encoding)
  • 23. Elastic Load Balancer Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (master) Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (slave) Availability zone Availability zone Streaming replication S3 WAL archives Periodic full backups Offsite backups S3 (static files)
  • 24. Memcached • Each availability zone has two Memcached nodes (failure/memory corruption within AZ only affects 1/2 of the keyspace local to the zone) • We’re discussing about moving to Amazon ElastiCache (protocol- compatible with Memcached) • Precalculated things (e.g. counts, which are very slow with PostgreSQL) • Generally do whatever you can to reduce amount of queries hitting the database • Again, we mostly use a naive TTL-based eviction policy • not trivial to co-ordinate TTLs betweenVarnish and Memcached optimally • The way to handle peak traffic if requests do not hitVarnish cache • Effective for contents that are common to several different pages
  • 25. Elastic Load Balancer Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (master) Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (slave) Availability zone Availability zone Streaming replication S3 WAL archives Periodic full backups Offsite backups S3 (static files)
  • 26. RabbitMQ • Celery + RabbitMQ for asynchronous processing • For example, we use it for automatically scaling user uploaded images • file is uploaded and the resize job is put to the queue • once we have a spare worker, the job is carried out • Scales well, integrates nicely with Django • Can be used as a buffer e.g. against latency spikes in DB • perform DB writes in batches -> peak commit frequency drops to a fraction • your DB doesn’t have to scale 1:1 for write ops from users
  • 27. Elastic Load Balancer Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (master) Varnish proxy nginx + FastCGI/Django Memcached RabbitMQ PostgreSQL (slave) Availability zone Availability zone Streaming replication S3 WAL archives Periodic full backups Offsite backups S3 (static files)
  • 28. PostgreSQL • Reliable, proper RDBMS for core features • Open source, not depending on corporate strategies • PostGIS+GeoDjango for location-aware features • NoSQL could be beneficial in certain write-heavy components • in our scale, PostgreSQL (+queues) still does its job well enough • more technology == more things that can break == more time spent doing ops stuff instead of developing
  • 29. PostgreSQL and AWS • Very challenging to build reliable and reasonably performing DB in AWS • EBS is slow • not just slow but performance varies a lot • shares NIC with your instance... • get the largest instance type you can afford • assume max. 100 iops per volume • Our solution: RAID 10 (8 volumes mirrored and striped + 1 spare) + trying to fit as much in RAM as possible • Everything can fail -- instances, EBS platform... • usually local to an availability zone • multifails have happened (EBS fails in one AZ -> cannot launch new instances to other AZs) • Master-slave configuration a must • Streaming replication between multiple availability zones • WAL archiving to S3
  • 30. Amazon Web Services • Initially seems easy • Soon, you’ll notice it’s actually very hard • With elasticity comes the need to take control of the ever-changing configuration • Amazon provides building blocks for your infrastructure, how you manage it is up to you • Solve this as soon as you can • Learn from others mistakes • Eventually, (most) things become easy again
  • 31. How are we doing it? Processes
  • 32. DevOps culture • How to make zero downtime deployments? • During peak hours? • Think how your features can be deployed when you design/develop them • Feel the pain of having to fix production problems, you do your best to try to prevent problems • Fear of a phone call in the middle of the night is an effective motivator :)
  • 33. DevOps culture • It grows from simple things • We have a rolling “Janitor” role -- each developer gets to be the Janitor for one sprint at a time • primary person to handle deployments, monitoring etc. • gives time for the rest of the team to focus on their backlog items
  • 34. Infrastructure as a code • The best way to document how your servers can be rebuilt is to have it in your version control • Server configurations are part of your application and they do change in time • We tried to use Fabric for managing configuration • didn’t believe when everyone on the web said it’s a mistake • Fabric is a great for SSH automation but not for configuration management • We’ve started replacing the Fabric-managed nodes with Chef + CloudFormation managed nodes • Painful to get started, but gets better • Amazon provides a sample CloudFormation template to setup Chef server + infra for validation.pem provisioning • Promising push-based newcomer: http://ansible.github.com/
  • 35. Continuous Delivery • All changes in version control gets deployed in CI • Automated acceptance tests in CI • If green, it’s a potential QA release • User Acceptance Tests in QA • Automate tests especially for things that • change a lot • break a lot • are difficult/laborious to test manually • Code coverage is not a particularly good way to measure automated test quality
  • 36. Deployments • Goal: low-risk, zero downtime releases during office hours • Decouple deployments from launching new features • Use feature toggles to enable/disable features • For Django, Gargoyle by Disqus does the trick very well • Canary releases -- deploy only to a certain group of users (e.g. % of user base, users coming from a certain IP block etc) and monitor results • monitor also also business metrics: did the feature have the effect we wanted? • if a feature appears to cause issues, disable it and have it fixed
  • 37. Dark launches ! • Add a new feature, e.g. a widget on front page but do not yet show it to users (just make it do whatever it is supposed to do) • Measure impact (e.g. performance) without the users noticing something might be wrong • Again, enable first for certain amount of users and if you notice problems, disable the feature • Once you’ve made sure everything works smoothly, unhide the widget
  • 39. Packaging Django • Deploying Django is a pain • Pip + virtualenv is great in theory • In practice, there are too many issues: • Pypi down or slow (you’ll need your own Pypi mirror) • 10 minutes/server for building virtualenv • Each server needs to have C compiler + bunch of - devel packages -- stuff that doesn’t belong to production machines!
  • 40. Packaging Django • We prepare a RPM package on CI build server • Includes virtualenv (not yet using --relocatable due to issues with it) • RPM packaging easy to automate with fpm (https:// github.com/jordansissel/fpm) • The RPM package is installed in all environments from an internalYum repository • Easy to manage -- just like any other application • Once installed, update symlinks & restart FastCGI and we’re live
  • 41. Packaging Django • The whole deployment process is currently automated with Fabric • Once command can do everything: • rolling update, one availability zone at a time • attaches/detaches nodes from Elastic Load Balancer • automatically handle scheduled downtime in Nagios • The whole stack is updated in a few minutes
  • 42. Data model changes • Simple solution: avoid them • If not possible to avoid, use patterns such as expand/contract: • Add a new (NULLable) field • Lazy migration: on user request try to read from new field, if data is not present, fill it • Once almost everything is migrated, run a batch update on rest • Finally, enforce constraints (NOT NULL etc) • Backwards compatibility critical
  • 43. Data model changes • Decoupling database changes and SW difficult with ORM + South • Locking issues • PostgreSQL requires exclusive locks when it creates foreign keys • Example: we use Generic Foreign Keys and django_content_type extensively for linking content to other content • Adding new fields with FK to django_content_type are not possible if there are long-running transactions having any locks on the django_content_type_id
  • 44. Django ORM • Sometimes you may need to use raw SQL to trivialize your problems • Trees are traditionally nasty for relational DBs • news sections are a tree structure and we often need to fetch data from current section and all its children and parent recursively • Pretty much the bread and butter kind of query type that occurs in several parts of the site for various types of content • We did it first with Django ORM • In worst cases, it took several seconds to execute per query • Replaced it with raw SQL that uses PostgreSQL’s WITH RECURSIVE and the query does its job in milliseconds • It’s great to know you always have the choice of raw SQL
  • 45. Summary • Django was critical for our success and we strongly believe it will enable us to do great things also in the future • Deploying Django is hard • Doing things right in the cloud is hard • Not saying we’re doing everything right, but we’re going there • But luckily we like challenges!