Going Big: Scalability
Who am I?
• Chris Miller

• Huffington Post - Senior Developer
• CMS platform and API

• Started in systems/network admin before code
What is Huffington Post?
• #87 most popular site in the world (Alexa)
• #3 most popular news site in the world (Alexa)
• #19 most popular US site (Alexa)

• More traffic than nytimes.com
Our Platform: Today
• Everything! No, really.

•   Perl: CMS core
•   PHP “layer” integrated on top of Perl code
•   MySQL data storage
•   MongoDB for comments storage
•   Hadoop for internal statistical analysis
•   Memcache for lightweight caching
•   Redis for more structured data types
•   Varnish for caching!
Our Platform: Tomorrow
• Re-think tools and platform from ground up
• Building new API
   – Yes, OAuth 2.0!
   – Complete REST approach
   – Will be public!
• We can’t re-write everything at once, so the API build has 4
  phases:
   –   Build “bridge” middleware to allow access to existing functionality
   –   Refactor backend edit/admin tools
   –   Refactor frontend to use API
   –   Transparently, and calmly, refactor old code while maintaining API
       interfaces
So what about CI?
• New API is built on CodeIgniter
  – Using Phil’s REST library as a starting point
     • Thanks Phil!


• Backend editorial tools are being built on CI

• We love CI
  – But it isn’t our only framework
  – Different tools work better for different teams
  – We use what works. You should too.
How we scale
• CDN: Akamai
     • 80%+ hit rate
     • Amazon S3 for origin of static files
• Basic page layout/content is generated to a flat file
     • These files contain some dynamic content, in PHP
     • Serving the basic page from a flat file means less overhead
       per load
     • It also means that for certain changes, we have to
       "regenerate" the page. Ugh.
Varnish
• HTTP caching reverse proxy (“HTTP Accelerator”)
• Caching layer in front of your web server

• Stores complete responses in memory
• If a cached response exists for the request, serves it from memory
   – Otherwise, forwards to the web server, and then caches the response

• Works nicely with Linux Kernel to delegate memory
  allocation and management to the OS, where it
  belongs
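
The lookup-then-fetch flow described above can be sketched in a few lines of Python. This is a toy model, not how Varnish is implemented: the `ResponseCache` class, the `backend` function, and the fixed 30-second TTL are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class ResponseCache:
    """Toy in-memory response cache keyed by URL (illustrative only)."""

    def __init__(
        self,
        fetch_from_backend: Callable[[str], str],
        ttl_seconds: float = 30.0,  # assumed default, not a Varnish setting
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_backend
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # url -> (expiry, body)

    def get(self, url: str) -> Tuple[str, str]:
        """Return (body, "HIT" or "MISS") for the requested URL."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            # A fresh copy is in memory: serve it without touching the backend.
            return entry[1], "HIT"
        # Nothing cached (or it expired): ask the web server, then cache it.
        body = self._fetch(url)
        self._store[url] = (now + self._ttl, body)
        return body, "MISS"


backend_calls = []


def backend(url: str) -> str:
    """Hypothetical web server that records every request it receives."""
    backend_calls.append(url)
    return f"<html>rendered {url}</html>"


cache = ResponseCache(backend)
print(cache.get("/politics"))
print(cache.get("/politics"))
print(len(backend_calls))
```

The second `get` is answered from memory, so the backend is called only once for the two requests.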
Controlling Varnish
• Set custom TTLs for content:
if (beresp.http.X-HP-Cache-Control ~ "s-maxage") {

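    // Reduce the header to just the s-maxage value (\1 is the captured number).
    set beresp.http.X-HP-Cache-Control = regsub(beresp.http.X-HP-Cache-Control, "^.*s-maxage=([0-9]+).*", "\1");

    // set the ttl.
    C{
        char *ttl;
        /* Header names in inline C are length-prefixed: \023 is octal 19, the length of "X-HP-Cache-Control:". */
        ttl = VRT_GetHdr(sp, HDR_BERESP, "\023X-HP-Cache-Control:");
        VRT_l_beresp_ttl(sp, atoi(ttl));
    }C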
    set beresp.http.X-Cacheable = "CUSTOM: " + beresp.ttl ;

} elsif (beresp.http.X-HP-Cache-Control ~ "(no-cache|private)" || beresp.http.pragma ~ "no-cache") {

    set beresp.ttl = 0s;
    set beresp.http.X-Cacheable = "NO-CACHE";

} else {

    set beresp.http.X-Cacheable = "DEFAULT: 30s";
    set beresp.ttl = 30s;

}
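
To make the branch order of the VCL above easier to follow, here is a rough Python equivalent of the same decision. It is a sketch only: the `decide_ttl` function and its return shape are made up for illustration, the label format is approximate, and a header that mentions `s-maxage` without a number is assumed to yield a TTL of 0.

```python
import re
from typing import Optional, Tuple

DEFAULT_TTL_SECONDS = 30


def decide_ttl(
    hp_cache_control: Optional[str], pragma: Optional[str] = None
) -> Tuple[int, str]:
    """Return (ttl in seconds, X-Cacheable label) for a backend response."""
    header = hp_cache_control or ""

    # Branch 1: the backend asked for a custom TTL via s-maxage.
    if "s-maxage" in header:
        # Same greedy pattern as the VCL regsub: captures the digits
        # after the last "s-maxage=" in the header.
        match = re.match(r".*s-maxage=([0-9]+)", header)
        ttl = int(match.group(1)) if match else 0  # assumption: malformed -> 0
        return ttl, f"CUSTOM: {ttl}"

    # Branch 2: the backend marked the response as uncacheable.
    if re.search(r"no-cache|private", header) or re.search(
        r"no-cache", pragma or ""
    ):
        return 0, "NO-CACHE"

    # Branch 3: no instructions at all, so fall back to 30 seconds.
    return DEFAULT_TTL_SECONDS, "DEFAULT: 30s"


print(decide_ttl("public, s-maxage=300"))
print(decide_ttl("private"))
print(decide_ttl(None))
```

Because the `s-maxage` check comes first, a response carrying both `s-maxage` and `private` is still cached for the requested time, exactly as in the VCL.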
Controlling Varnish
• Refreshing content

sub process_refresh_requests {

    if (req.request == "REFRESH") {
        set req.request = "GET";
        set req.hash_always_miss = true;
    }

}


• This is invoked early in the vcl_recv method
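
The effect of the REFRESH trick can be modeled roughly as follows: the custom method is rewritten to GET, the cache lookup is skipped, and the fresh response replaces the cached copy. The `handle` function and the dict-based cache are hypothetical stand-ins, not Varnish internals.

```python
from typing import Callable, Dict, Tuple


def handle(
    method: str, url: str, cache: Dict[str, str], fetch: Callable[[str], str]
) -> Tuple[str, str]:
    """Serve url, treating REFRESH as a GET that always misses the cache."""
    always_miss = False
    if method == "REFRESH":
        # Same two steps as the VCL: become a GET, but force a miss.
        method = "GET"
        always_miss = True
    if method != "GET":
        raise ValueError(f"unsupported method: {method}")

    if not always_miss and url in cache:
        return cache[url], "HIT"

    # Fetch from the backend and overwrite whatever was cached before.
    cache[url] = fetch(url)
    return cache[url], "MISS"


# Hypothetical backend whose content changes between requests.
versions = iter(["headline v1", "headline v2"])
page_cache: Dict[str, str] = {}


def fetch_page(url: str) -> str:
    return next(versions)


print(handle("GET", "/front", page_cache, fetch_page))
print(handle("REFRESH", "/front", page_cache, fetch_page))
print(handle("GET", "/front", page_cache, fetch_page))
```

Unlike a purge, the old copy stays available until the new one has been fetched, so ordinary readers never trigger the expensive miss themselves.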
Edge Side Includes
• Include cached content blocks into pages

<html>
<body>

<esi:include
   src="http://example.com/my_page1.html"
   alt="http://example.com/my_page2.html"
   onerror="continue"
/>

</body>
</html>
Edge Side Includes
• How to use ESI:
  – Make complicated blocks independently accessible
    URIs
  – Create a “template” file with ESI includes to bring
    the page together
• Why this is powerful
  – If multiple pages use different combinations of
    page components, some may already be cached
  – Reduces the number of times the entire page must be
    served; serve only the components needed
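
As a rough illustration of what the edge does with a template, the sketch below replaces each `esi:include` tag with the fragment at `src`, falls back to `alt`, and honors `onerror="continue"`. The `assemble` function and the dict of fragments are assumptions for this example. Real ESI processors differ in which attributes they support, so treat it as a model of the idea rather than of any particular implementation.

```python
import re
from typing import Callable, Dict

INCLUDE_TAG = re.compile(r"<esi:include\s+([^>]*?)/>", re.DOTALL)
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def assemble(template: str, fetch_fragment: Callable[[str], str]) -> str:
    """Replace every <esi:include/> tag in template with its fragment."""

    def replace(match: "re.Match[str]") -> str:
        attrs: Dict[str, str] = dict(ATTRIBUTE.findall(match.group(1)))
        # Try the primary URL first, then the alternate one.
        for key in ("src", "alt"):
            url = attrs.get(key)
            if url is None:
                continue
            try:
                return fetch_fragment(url)
            except LookupError:
                continue
        # Neither URL worked: drop the block quietly or fail the page.
        if attrs.get("onerror") == "continue":
            return ""
        raise LookupError(f"no fragment available for {attrs.get('src')}")

    return INCLUDE_TAG.sub(replace, template)


# Hypothetical fragment store standing in for independently cached URIs.
fragments = {"http://example.com/my_page2.html": "<p>Top stories</p>"}

page = (
    "<html><body>"
    '<esi:include src="http://example.com/my_page1.html" '
    'alt="http://example.com/my_page2.html" onerror="continue" />'
    "</body></html>"
)

print(assemble(page, lambda url: fragments[url]))
```

Here `my_page1.html` is missing from the store, so the alternate fragment is used. Each fragment can carry its own TTL, which is what lets pages that share components reuse cached pieces.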
Varnish Tricks
• Intelligently purge the cache when your
  content changes
  – Allows you to increase TTL without fear of caching
    outdated content
     if (req.request == "PURGE") {
         if (!client.ip ~ purgers) {
             error 405 "Method not allowed";
         }
         return (lookup);
     }
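
The access check in that snippet boils down to "is the client address inside an allowed network?". A minimal Python sketch of the same idea follows. The `PURGERS` networks, the `handle_purge` function, and the 200/404 responses for a cached or uncached object are assumptions for illustration, since the slide shows only the 405 case.

```python
import ipaddress
from typing import Dict, Sequence, Tuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Hypothetical ACL: only these networks may purge.
PURGERS: Sequence[Network] = (
    ipaddress.ip_network("127.0.0.1/32"),
    ipaddress.ip_network("10.0.0.0/8"),
)


def handle_purge(
    client_ip: str,
    url: str,
    cache: Dict[str, str],
    allowed: Sequence[Network] = PURGERS,
) -> Tuple[int, str]:
    """Remove url from cache if the client is allowed to purge."""
    address = ipaddress.ip_address(client_ip)
    if not any(address in network for network in allowed):
        # Same response as the VCL for clients outside the ACL.
        return 405, "Method not allowed"
    if cache.pop(url, None) is None:
        return 404, "Not in cache"
    return 200, "Purged"


article_cache = {"/story/123": "<html>old version</html>"}
print(handle_purge("203.0.113.9", "/story/123", article_cache))
print(handle_purge("10.1.2.3", "/story/123", article_cache))
print(handle_purge("10.1.2.3", "/story/123", article_cache))
```

In practice the CMS would send the purge when an editor saves a story, which is what makes long TTLs safe.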
Other Scaling Tips
• Hardware SSL offloading is your friend
• Consider mod_php
  – CGI has huge overhead
  – CGI/SuExec has huge security advantages
  – FastCGI is a happy-medium for some
Other Scaling Tips
• Don’t try to do everything on one
  server/cluster
  – Splitting your application is ok
  – 1 cluster for frontend, 1 server/cluster for backend, etc.


• Keep an open mind about technologies,
  platforms, and tools
One More Thing…
   (sorry, I couldn’t resist)
Guilds!
• What a guild is:
   – Groups of people around a topic
   – Membership/participation is encouraged, but not
     required
   – Think of it as an internal Meetup

• Join to learn new things
• Join to talk about things you are interested in

• Examples: PHP, Front End, Python, Ruby,
  Management, Platform/Architecture, Big Data,
  etc…
Guilds!
• Experts to solve technology-specific problems
  – Example: Front-end SWAT team to improve page load
    times hurt by slow or excessive JS


• Collectively give back to the community around
  your technology

• Help others learn, and learn from others

• Meet people on other teams
Guilds!
• Try it out
¿Preguntas?

Questions?

Perguntas?
Chris Miller

chris.miller@huffingtonpost.com

           @ee99ee



   (P.S. – We’re hiring in NYC)

CI_CONF 2012: Scaling
