18. The Solution: Partition. Work must be structured such that each resource can complete it independently; there is overhead to dividing the workload.
19. Data architecture: Look at the queries you perform. Divide the data such that each query can be answered by querying no more than one partition.
20. Comments on a profile. Comments(user_id, author_id, comment). Queries: post a comment on a user’s profile; get the list of comments on a user’s profile; delete a comment from a user’s profile. Give up for now: comments written by a user.
21. Comments on a profile. Partition by user. Costs: determining the partition of a user (constant); consistency check on access that the author still exists (linear in the number of comments to display).
23. Alternative Solution Partition by user, duplicate by author Comments(user_id, author_id, comment) AuthoredComments(author_id, user_id, comment_id)
24. Alternative Solution. Comments(user_id, author_id, comment); AuthoredComments(author_id, user_id, comment_id). Costs: double writes, extra storage, delete by author still very expensive.
29. Traditional Systems Architecture www.tuenti.com 12.45.34.179 12.45.34.178 Load Balancer Load Balancer Web server farm Web server farm Web server farm Web server farm
30. AJAX What is AJAX? “Asynchronous JavaScript and XML” Paradigm for client-server interaction Change state on client, without loading a complete HTML page
31. Traditional HTML Browsing User clicks link Browser sends request Server receives, parses request, generates response Browser receives response and begins rendering Dependent objects (images, js, css) load and render Page appears
32. AJAX Browsing User clicks link Browser sends request Server receives, parses request, generates response Browser receives response and begins rendering Dependent objects (images, js, css) load and render Page appears
33. How does Tuenti use AJAX? The only full page loads are login and the home page. A loader pulls in all JS/CSS; afterwards we stay within one HTML page, rotating the canvas area content.
34. Balancing Load Top-level requests to www.tuenti.com Each request tells client which farm it should be using, based on a mapping Mapping can be changed to balance load, perform maintenance, etc
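The mapping described above can be sketched in a few lines. This is a minimal illustration, not Tuenti's implementation: the farm list matches the hostnames on the next slides, but the `overrides` table, the integer `user_id` keys, and the modulo mapping are all assumptions.

```python
# Hypothetical sketch: the top-level response at www.tuenti.com tells the
# client which web server farm to use, based on a user -> farm mapping.
FARMS = ["wwwb1.tuenti.com", "wwwb2.tuenti.com",
         "wwwb3.tuenti.com", "wwwb4.tuenti.com"]

overrides = {}  # user_id -> farm; edited to rebalance load or drain a farm

def farm_for(user_id):
    # Overrides win, so operators can move users off a farm for maintenance
    return overrides.get(user_id, FARMS[user_id % len(FARMS)])
```

Because the mapping lives on the server, changing it (e.g. filling `overrides` to drain one farm) rebalances clients without touching DNS.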
35. Client-side Routing www.tuenti.com wwwb3.tuenti.com wwwb2.tuenti.com wwwb1.tuenti.com wwwb4.tuenti.com Load Balancer Load Balancer Load Balancer Load Balancer Web server farm Web server farm Web server farm Web server farm Linearly scalable …
36. Client-side Routing www.tuenti.com wwwb3.tuenti.com wwwb2.tuenti.com wwwb1.tuenti.com wwwb4.tuenti.com Load Balancer Load Balancer Load Balancer Load Balancer Web server farm Web server farm Web server farm Web server farm Linearly scalable … except for top level
37. Client-side Routing www.tuenti.com wwwb3.tuenti.com wwwb2.tuenti.com wwwb1.tuenti.com wwwb4.tuenti.com Load Balancer Load Balancer Load Balancer Load Balancer Web server farm Web server farm Web server farm Web server farm lots of content creation = lots of dynamic data
38. Client-side Routing www.tuenti.com wwwb3.tuenti.com wwwb2.tuenti.com wwwb1.tuenti.com wwwb4.tuenti.com Load Balancer Load Balancer Load Balancer Load Balancer Web server farm Web server farm Web server farm Web server farm Cache Farm lots of dynamic data = lots of cache = internal network traffic
39. Client-side Routing www.tuenti.com wwwb3.tuenti.com wwwb2.tuenti.com wwwb1.tuenti.com wwwb4.tuenti.com Load Balancer Load Balancer Load Balancer Load Balancer Web server farm Web server farm Web server farm Web server farm Cache Farm Cache Farm Cache Farm Cache Farm Partition cache Route requests to a farm near cache needed to respond
49. What is a CDN? Examples: Akamai, Limelight; also dozens more, including Amazon. A big, distributed object cache. Pay per use: either per request, per TB transferred, or per peak Mbps.
50. What is a CDN? Advantages: Outsource dev and infrastructure Geographically distributed Economies of scale Disadvantages: High cost Less control and transparency Commitments
51. What affects image load time? Client internet connection Response time of CDN CDN cache hit rate
52. What affects image load time? Client internet connection Response time of CDN CDN cache hit rate
54. Monitor Performance from Client. Closer to the performance experienced by the end user. The only way to get a view of network issues faced by users (i.e., the last mile).
56. How to fix a slow ISP? Choose a better transit provider; set up peering (or get a CDN too); traffic management.
57. What affects image load time? Client internet connection Response time of CDN CDN cache hit rate
69. Pre-fetch Content. Exploit predictable user behavior, e.g. clicking to the next photo in an album. Simple solution: load the next image hidden; the client browser will cache it (next response < 100 ms). Increases tolerance for slow response time.
70. Pre-fetch Content. More complex solution: pre-fetch the next canvas (full HTML), render it in the background, and rotate it in on Next. Even more complex: instantiate an HTML template with data on the client; pre-fetch data X photos in advance and render Y templates in advance with this data.
71. Pre-fetch Content Problems: Rendering still takes time Increases browser load Need to set cache headers correctly
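The simple pre-fetch idea above can be sketched as follows. This is an illustrative toy, not Tuenti's client code (which is JavaScript in the browser): `fetch` stands in for the real HTTP download, and the dict plays the role of the browser cache.

```python
cache = {}

def fetch(url):
    # Placeholder for the real HTTP download (hypothetical)
    return "bytes-of-" + url

def show_photo(album, i):
    # Serve from cache if the image was pre-fetched, else fetch it now
    img = cache.pop(album[i], None) or fetch(album[i])
    # Pre-fetch the next photo while the user looks at this one, so that
    # clicking "next" is served locally
    if i + 1 < len(album):
        cache[album[i + 1]] = fetch(album[i + 1])
    return img
```

The trade-off in the "Problems" slide shows up here: every view triggers an extra fetch, and wasted pre-fetches cost bandwidth if the user never clicks next.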
72. Image delivery. Small images: high request rate, low data volume; most cost-effective to cache in memory. Large images: high data volume, low request rate, greater tolerance for latency.
73. What affects image load time? Client internet connection Response time of CDN CDN cache hit rate
ComScore numbers show that we have more traffic than all Google properties combined. ComScore estimates 1 in 6 web pages viewed in Spain is from Tuenti. ComScore numbers are lower than our internal measurements.
This is what makes web programming different from application programming. What matters is how much the system can do in a given period of time, not how much time it needs to do one thing. Below a reasonable threshold, I care about how far out to the right I can get on the curve.
“Scalability” is a property of a system architecture. Generally speaking, a system is scalable if it can continue to perform acceptably well as load increases. The load level at which performance becomes unacceptable is the capacity of the system.
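This definition of capacity can be made concrete with a tiny numeric sketch. The performance curve and all numbers below are made up for illustration; only the definition (the highest load at which response time stays acceptable) comes from the notes.

```python
def latency_ms(load_rps):
    # Toy performance curve: flat response time, then degrading past a knee
    return 20 + max(0, load_rps - 5000) * 0.01

def capacity(threshold_ms=50, step=100):
    # Capacity: the highest load at which latency stays within threshold
    load = 0
    while latency_ms(load + step) <= threshold_ms:
        load += step
    return load
```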
Many users trying to access a resource at the same time.
If I add resources, I should be able to shift the curve right. If the system is linearly scalable, it should be able to handle 2x the requests with 2x the machines.
Performance graph from Tuenti from October
Split resource into two, then send half load to one, half load to the other.
The two resources should perform independently, such that the performance curve for the entire system is the sum of the curves for each resource.
These are the two major caveats. The former is fundamentally a design question, and is essentially a data architecture question. The latter is generally simpler to address, and I’ll discuss it a bit later.
These are the two major caveats. The former is fundamentally a design question, and is essentially a data architecture question. The latter is generally simpler to address, but it adds some constant overhead to each response, such that performance is not simply the sum of two curves for independent systems.
This is a very simple example of comments on a profile. I really only need 3 queries: post (insert) a comment on a user’s profile, get the list of comments posted to a user’s profile, and delete a comment from a user’s profile. I’m going to give up on getting a list of comments written by a user – might be nice, but isn’t critical.
The solution is to partition by user. You need a way to map a user to a partition (hash function, lookup table, etc.). Each partition contains data for a set of users, but all the data for each of those users: if user A is on partition 1, all comments on user A’s profile can be found on partition 1, and none of those comments are stored on any other partition. This imposes some costs. 1) Determining the partition of a user, i.e., computing some partitioning function (hash, lookup table, etc.). 2) Since comments WRITTEN by a given user might be spread across many partitions, we’re unable to delete all such comments when that user is deleted (we can’t even look them up without querying every partition). We can’t have any kind of foreign key to enforce that all comments have valid authors. The only solution is to enforce this when we actually access the author information. In practice, this doesn’t add much overhead: presumably when we want to display the comment, we want to display basic info about the author (such as name) as well. If we’re unable to find that basic info, the author probably doesn’t exist, and at that point we can delete the comment. Slightly more logic is needed to account for this possibility and execute the delete, but it’s not too costly. In fact, it’s constant overhead per comment, and presumably the number of comments we display per request is constant with respect to the rate of requests, so it’s constant with respect to requests.
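The partition-by-user scheme above can be sketched in a few lines. This is a hedged illustration, not Tuenti's code: the in-memory dicts stand in for database shards, the class and constant names are invented, and a real system would use a stable hash or a lookup table rather than Python's `hash`.

```python
N_PARTITIONS = 4  # illustrative shard count

class CommentStore:
    def __init__(self):
        # One dict per partition: user_id -> list of (author_id, comment).
        # All of a user's profile comments live on exactly one partition.
        self.partitions = [dict() for _ in range(N_PARTITIONS)]

    def _partition_of(self, user_id):
        # Constant-cost partitioning function (cost 1 from the slide)
        return self.partitions[hash(user_id) % N_PARTITIONS]

    def post(self, user_id, author_id, comment):
        self._partition_of(user_id).setdefault(user_id, []).append((author_id, comment))

    def get_comments(self, user_id):
        # Each of the three supported queries touches exactly one partition
        return self._partition_of(user_id).get(user_id, [])

    def delete(self, user_id, index):
        self._partition_of(user_id)[user_id].pop(index)
```

Note there is deliberately no `comments_by_author` method: answering that query would require scanning every partition, which is exactly the operation the design gives up on.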
The two resources should perform independently, such that the performance curve for the entire system is the sum of the curves for each resource. The two costs I pointed out on the previous slide add overhead to every request, but this overhead is constant with respect to the rate of requests.
If you need to look up comments by author, this can be achieved by maintaining a second table that is partitioned and indexed by author. Querying one partition can get you a list of all comments written by a user, but to get the content you’ll still need to query the primary partitions for each item, which can be expensive.
Every time a comment is written or deleted, you’ll have to write into both the author partition and the user-profile partition: a constant expense. You’ll have a constant overhead of storage; every byte in the AuthoredComments partition is duplicated data. Selecting by author is still very expensive, unless you duplicate the entirety of the comment data, which would be a significant storage cost. This duplication won’t make deletion any faster; deletion in the worst case could still require hitting every partition. This solution could be appropriate for some workloads, but it has a number of drawbacks that make me inclined not to choose it.
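The dual-write scheme can be sketched as follows. This is an assumption-laden toy: plain dicts stand in for the two partitioned tables, the function names are invented, and a real system would make the two writes transactional or reconcile them asynchronously.

```python
import itertools

comments = {}   # (user_id, comment_id) -> (author_id, text); think "partitioned by user_id"
authored = {}   # author_id -> set of (user_id, comment_id); think "partitioned by author_id"
_next_id = itertools.count(1)

def post(user_id, author_id, text):
    cid = next(_next_id)
    comments[(user_id, cid)] = (author_id, text)                # write 1: user partition
    authored.setdefault(author_id, set()).add((user_id, cid))   # write 2: author partition
    return cid

def texts_by_author(author_id):
    # One lookup on the author index yields references; fetching each text
    # still hits the referenced user partitions (the expensive part)
    return sorted(comments[ref][1] for ref in authored.get(author_id, set()))
```

The cost structure from the slide is visible directly: `post` does two writes, the index stores duplicated reference data, and `texts_by_author` fans out across user partitions.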
What I’ve previously described is the partitioning technique applied to databases. We also use analogous methods for scaling our web server and cache tiers.
Load balancer is single point through which requests run – subject to contention, failure.
Load balancer is a single point through which many requests are sent, making it a possible point of contention.
Applying analogous partitioning techniques to the web server tier.
Traditionally, one way to partition the web server tier is with round-robin DNS (RRDNS) to split requests across two load balancers, but …
… we have AJAX. JavaScript and XML are just technologies; “Asynchronous” is what’s important: the shift in thinking from web browsing as serial, page by page, to more fluid navigation that’s wholly contained within the same HTML page. I’m not going to go much into implementation; it’s a lot of detail, and talking about cross-browser compatibility isn’t so fun or interesting. Focus on approaches: what we’ve learned from scaling on the server side can be applied to the client side.
Using AJAX in application design allows steps 1-6 to collapse a bit.
Using AJAX in application design allows steps 1-6 to collapse a bit; we can play with them, and things don’t have to happen in such a serial order.
Doesn’t eliminate the single point of contention at the login/auth/home page load tier, but it does push this back a ways.
However, we have lots of dynamic content, and we heavily use memcache as the storage tier for that content (backed by MySQL instances for persistence) …
But that means we have a ton of data in cache, so a large number of cache servers are needed to store it. That makes for a large cache tier behind our tier of server farms. What are the problems with that? What happens when a web server physically (and logically, at the network level) at one end of the internal network needs data cached at the other end? It’s a long way to go, crossing (and congesting) a ton of intermediary links in the process. All that data crossing in the middle requires powerful switches and large links (even if you have a ring or some other more exotic network architecture)…
The solution is to partition the cache, then route page requests from clients directly to farms that are physically/logically near the partitions holding the data needed to respond. The net effect is that fewer cache requests need to cross the network to get their data; instead they are routed to the cache partition immediately behind the farm. This saves internal network traffic by reducing the hops the data has to take: instead of 1 byte passing over 4 links (web server, web rack switch, center switch, cache rack switch, cache), we pass 1 byte over 2 links (web server, web rack switch, cache), a 50% savings. Less aggregate network traffic means less switching/link capacity is required. Fewer hops also means less latency. In practice, the latter is quite clear…
The global cache tier is an unpartitioned cache: the cache server holding the data is as likely to be on the other side of the network as it is to be in the same rack as the web server making the request. The partitioned cache is separated into farms; each request is routed from the client (by picking a farm in JavaScript) to a web server farm that is likely to be close to the cache farm where most of the data needed for the page is cached. The savings is ~40% and will grow as the size and complexity of the network increases.
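The co-location idea above can be sketched numerically. Everything here is an illustrative assumption (farm count, modulo placement, link counts); only the 2-versus-4-link comparison comes from the notes.

```python
N_FARMS = 4  # illustrative number of web/cache farm pairs

def cache_farm(user_id):
    # Which cache farm holds this user's partition
    return user_id % N_FARMS

def web_farm(user_id):
    # Co-locate: route the client to the web farm next to that cache farm
    return cache_farm(user_id)

def links_crossed(user_id, handling_web_farm):
    # 2 links when data sits in the adjacent cache farm, 4 when the
    # request must cross the core switches (the 50% savings in the notes)
    return 2 if handling_web_farm == cache_farm(user_id) else 4
```

With client-side routing every request lands on `web_farm(user_id)`, so cache lookups take the 2-link path; an unpartitioned cache would hit the 4-link path a fraction of the time proportional to the number of farms.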
Performance graph from Tuenti from October
In December, we managed to flatten the line, but traded away ~10 ms in best-case performance. A good illustration of trading response time for scalability. Note: the fit might not be the most appropriate; it is flattened by some outliers in the low-load range. But we are clearly reaching higher load at a comparable level of performance in December, although there are some poorly performing outliers at high load as well.
The February graph looks really good: we continued to flatten the line and won back the 10 ms cost we paid in December. The data set is also less noisy than the previous two, indicating the system was more stable.
Further improvements in April: shifting the best case down another 10 ms while maintaining the slope. The data set is again quite clean, indicating a very stable system.
We deliver 25k pages/second at peak, but nearly 100k static files/second.
Competitive market: only two players (Akamai and Limelight) are financially very healthy, and Limelight is losing money if you consider investments.