In this talk I will go over Wix's architecture and how we evolved our system to be highly available even in worst-case scenarios, when everything can break. I will describe how we built a self-healing, eventually consistent system for website data distribution, and show some of the patterns that help us render 45M websites while maintaining a relatively low number of servers.
Wix Architecture at Scale - QCon London 2014
1. Wix Architecture at Scale
Aviran Mordo
Head of Back-End Engineering @ Wix
@aviranm
linkedin.com/in/aviran
aviransplace.com
2.
3. Wix in Numbers
Over 45,000,000 users
1M new users/month
Static storage is >800TB of data
1.5TB new files/day
3 data centers + 2 clouds (Google, Amazon)
300 servers
700M HTTP requests/day
600 people work at Wix, of whom ~200 are in R&D
4. Initial Architecture
Tomcat, Hibernate, custom web framework
Built for fast development
Stateful login (Tomcat session), Ehcache, file uploads
No consideration for performance, scalability and testing
Intended for short-term use
Lighttpd
(file serving)
Wix
(Tomcat)
MySQL
DB
5. The Monolithic Giant
One monolithic server that handled everything
Dependency between features
Changes in unrelated areas of the system forced a deployment of the whole system
A failure in an unrelated area caused system-wide downtime
7. Concerns and SLA
Edit websites: Data validation; Security / Authentication; Data consistency; Lots of data
View sites created by the Wix editor: High availability; High performance; High traffic volume; Long tail
Serving media: High availability; High performance; Lots of static files; Very high traffic volume; Viewport optimization; Cacheable data
9. Making SOA Guidelines
Each service has its own database (if one is needed)
Only one service can write to a specific DB
There may be additional read-only services that directly access the DB (for performance reasons)
Services are stateless
No DB transactions
Cache is not a building block, but an optimization
11. Editor Server
Immutable JSON pages (~2.5M / day)
Site revisions
Active-standby MySQL across data centers
[Diagram: Editor Server, MySQL (Active Sites), MySQL (Archive)]
12.
13. Protect The Data
Protect against DB outage with fast recovery = replication
Protect against data poisoning/corruption = revisions / backup
Make the data available at all times = data distribution to multiple locations / providers
14. Saving Editor Data
[Diagram. Components: Browser, Editor Server, Static Grid, MySQL (Active Sites, Archive) in two data centers with DC replication, Archive (Google), Google Cloud Storage, Archive (Amazon). Flow labels: Save Page(s), 200 OK, Save Page, Upload, Notify, Download Page.]
15. Self Healing Process
[Diagram. Components: Browser, Editor Server, Static Grid, MySQL (Active Sites, Archive) in two data centers with DC replication, Archive (Google), Google Cloud Storage, Archive (Amazon). Flow labels: Save Page(s), 200 OK, Save Page, Upload, Notify, Download Page.]
16. No DB Transactions
Save each page (JSON) as an atomic operation
Page ID is a content based hash (immutable/idempotent)
Finalize transaction by sending site header (list of pages)
Can generate orphaned pages, not a problem in practice
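The points above can be sketched as follows. This is a minimal illustration, not Wix's implementation: the in-memory `PageStore`, the use of SHA-256 over canonical JSON as the page ID, and all method names are assumptions made for the example.

```python
import hashlib
import json


def page_id(page: dict) -> str:
    """Derive the page ID from the page content (assumed: SHA-256 of canonical JSON)."""
    canonical = json.dumps(page, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PageStore:
    """In-memory stand-in for the page table and the site-header table."""

    def __init__(self) -> None:
        self.pages: dict[str, str] = {}          # page ID -> JSON text
        self.headers: dict[str, list[str]] = {}  # site ID -> list of page IDs

    def save_page(self, page: dict) -> str:
        # One atomic put per page. The key is derived from the content,
        # so repeating the same write changes nothing (idempotent).
        pid = page_id(page)
        self.pages[pid] = json.dumps(page, sort_keys=True)
        return pid

    def save_site(self, site_id: str, pages: list[dict]) -> list[str]:
        ids = [self.save_page(p) for p in pages]
        # Writing the header is the "commit": readers only reach the new
        # pages once the header lists them.
        self.headers[site_id] = ids
        return ids

    def orphans(self) -> set[str]:
        """Pages that no site header references any more."""
        referenced = {pid for ids in self.headers.values() for pid in ids}
        return set(self.pages) - referenced
```

If a save fails halfway, the old header still points at a complete set of pages; the pages already written simply stay unreferenced, which is the orphan case mentioned above.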
18. Prospero – Wix Media Storage
800TB user media files
3M files uploaded daily
500M metadata records
Dynamic media processing
• Picture resize, crop and sharpen “on the fly”
• Watermark
• Audio format conversion
20. Prospero – Wix Media Manager
[Diagram. Locations: Austin, Tampa, Google Cloud, each with server groups labeled x36, x36, T x32, T. Flow labels: get image.jpg, CDN, If not in CDN, First fallback, Second fallback.]
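The fallback idea on this slide can be sketched as an ordered list of sources that are tried one after another. This is only an illustration: the function name, the source labels, and the choice to treat an unreachable location as a miss are assumptions, and the slide does not say which location serves as which fallback.

```python
from typing import Callable, Optional, Sequence

# A fetcher returns the file bytes, or None when the file is not there.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(
    name: str, sources: Sequence[tuple[str, Fetcher]]
) -> tuple[str, bytes]:
    """Try each source in order; return (source label, data) for the first hit."""
    for label, fetch in sources:
        try:
            data = fetch(name)
        except OSError:
            # An unreachable location is treated like a miss (assumption).
            continue
        if data is not None:
            return label, data
    raise LookupError(f"{name} not found in any source")
```

A caller would pass the primary location first, followed by the first and second fallbacks, so a lost location only adds latency rather than failing the request.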
22. Public Segment Roles
Routing (resolve URLs)
Dispatching (to a renderer)
Rendering (HTML, XML, TXT)
[Diagram: www.example.com, Public Server, and its renderers: HTML Renderer, HTML SEO Renderer, Flash Renderer, Flash SEO Renderer, Sitemap Renderer, Robots.txt Renderer]
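A minimal sketch of the three roles, under stated assumptions: the routing table contents, the renderer functions, and the path-based dispatch rule are all hypothetical and only cover three of the renderers named in the diagram.

```python
# Hypothetical routing table: host name -> site ID.
ROUTES = {"www.example.com": "site-1"}


def render_html(site_id: str) -> tuple[str, str]:
    return "text/html", f"<!-- bootstrap page for {site_id} -->"


def render_sitemap(site_id: str) -> tuple[str, str]:
    return "application/xml", f"<urlset><!-- {site_id} --></urlset>"


def render_robots(site_id: str) -> tuple[str, str]:
    return "text/plain", "User-agent: *\nAllow: /\n"


# Hypothetical dispatch table: request path -> renderer.
RENDERERS = {"/sitemap.xml": render_sitemap, "/robots.txt": render_robots}


def handle(host: str, path: str) -> tuple[int, str, str]:
    """Return (status, content type, body) for one request."""
    site_id = ROUTES.get(host)                   # routing: resolve the URL
    if site_id is None:
        return 404, "text/plain", "unknown site"
    renderer = RENDERERS.get(path, render_html)  # dispatching: pick a renderer
    content_type, body = renderer(site_id)       # rendering: HTML, XML or TXT
    return 200, content_type, body
```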
24. Publish A Site
Publish site header (a map of pages for a site)
Publish routing table
[Diagram: Editor Segment publishes site header / routes to Public Segment]
25. Built For Speed
Minimize out-of-service hops (2 DB, 1 RPC)
Lookup tables are cached in memory, updated every 5 minutes
Denormalized data – optimize for read by primary key (MySQL)
Minimize business logic
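One way to keep a lookup table in memory and refresh it every 5 minutes is sketched below. The class name, the `load` callback, and the injectable clock are assumptions for illustration; this version also refreshes lazily on the first read after the interval has passed, whereas a real service might use a background timer instead.

```python
import time
from typing import Callable, Optional


class LookupTable:
    """In-memory copy of a small table, reloaded when older than ttl_seconds."""

    def __init__(
        self,
        load: Callable[[], dict],
        ttl_seconds: float = 300.0,  # 5 minutes, as on the slide
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: dict = {}
        self._loaded_at: Optional[float] = None

    def get(self, key):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            # One bulk load replaces the whole table; reads in between
            # never leave the process.
            self._data = self._load()
            self._loaded_at = now
        return self._data.get(key)
```

With this shape, a request that needs a route or a site flag costs a dictionary lookup rather than a database hop, at the price of data that can be up to five minutes stale.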
26. How a Page Gets Rendered
Bootstrap HTML template that contains only data
Only JavaScript imports
JSON data (site-header + dynamic data)
No “real” HTML view
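A sketch of what such a bootstrap page could look like when generated on the server. The function name, the `siteHeader` variable, and the script URL are hypothetical; the script URLs are assumed to be trusted values, so only the embedded JSON is escaped.

```python
import json


def bootstrap_html(site_header: dict, script_urls: list[str]) -> str:
    """Build a page that carries only JSON data and JavaScript imports."""
    # Escape "</" so the embedded JSON cannot close the <script> element early.
    data = json.dumps(site_header).replace("</", "<\\/")
    imports = "\n".join(f'<script src="{url}"></script>' for url in script_urls)
    return (
        "<!DOCTYPE html>\n<html><head>\n"
        f"<script>var siteHeader = {data};</script>\n"
        f"{imports}\n"
        "</head><body></body></html>"
    )


page = bootstrap_html(
    {"pages": ["a1b2", "c3d4"]},
    ["https://static.example.com/viewer.js"],
)
```

The body stays empty: the imported client code reads the JSON and builds the view in the browser.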
29. Why JSON?
Easy to parse in JavaScript and Java/Scala
Fairly compact text format
Highly compressible (5:1 even for small payloads)
Easy to fix rendering bugs (just deploy new client code)
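The compressibility claim is easy to check on your own payloads. The sketch below measures the gzip ratio of a JSON document; the sample page structure is invented for the example, and the ratio you get will depend on the payload, so no particular figure is implied.

```python
import gzip
import json


def compression_ratio(payload) -> float:
    """Size of the JSON text divided by the size of its gzip-compressed form."""
    raw = json.dumps(payload).encode("utf-8")
    return len(raw) / len(gzip.compress(raw))


# Invented sample: a page made of many similarly shaped components.
page = {
    "components": [
        {"id": f"c{i}", "type": "text", "x": i * 10, "y": 20, "width": 200, "height": 40}
        for i in range(50)
    ]
}
print(round(compression_ratio(page), 1))
```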
32. Serving a Site – Sunny Day
[Diagram. Components: Browser (http://example.wix.com), CDN, LB, Public Renderer, Archive, Statics. Flow labels: HTTP Request, HTML, Resources / Media, Notify site view, Store HTML to cache.]
33. Serving a Site – DC Lost
[Diagram. Components: Browser (http://example.wix.com), CDN, Statics, Archive, two LB + Public Renderer pairs. Flow labels: HTTP Request, Change DNS.]
34. Serving a Site – Public Lost
[Diagram. Components: Browser (http://example.wix.com), CDN, LB, Public Renderer, Archive, Statics. Flow labels: HTTP Request, HTML, Get Cached HTML Version.]
35. Living in the Browser
[Diagram. Components: Browser (http://example.wix.com), Editor, LB, Public Renderer, CDN, Archive, Statics. Flow labels: HTTP Request, HTML, JSON / Media, Fallback (twice).]
36. Summary
Identify your critical path and concerns
Build redundancy in critical path (for availability)
De-normalize data (for performance)
Minimize out-of-process hops (for performance)
Take advantage of client’s CPU power
Editor: read immediately after write; small working set.
Viewer: optimized for reads.
We fight for every ms. A page view means many resource downloads.
Read-only services are allowed only if they are part of the same business functionality.
Immutable data helps handle eventual consistency.
MySQL is a great key-value store.
Not all data is equal (only 6% of websites are edited 3 months after creation).
Revisions keep data safe from poisoning. We pay in storage and management.
Highly optimized code: every ms counts.
Tech vendor lock-in is a myth; it is easy to change the API (small dev effort). Invest in data distribution.
Evaluation of a new platform starts by putting the data there.
We can change the arrows as we want
Save pages as JSON
Upload to static storage
Explain what JSON is and what HTML is
UPS dies, secondary power source connected to the same UPS