ARCHITECTING EXTREMELY LARGE SCALE WEB APPLICATIONS
A MUST read for every architect
ABSTRACT
An overview of the much-needed shift in architectural thinking
from monolithic to microservices architecture.
PRASHANTH B PANDURANGA
Director of Technology
The New York Times auto-scaled to 500,000 users; HipChat has about 1.2 billion messages/documents stored;
Salesforce deals with 1,300,000,000 daily transactions, with over 24,000 database transactions per second
and over 22 PB of raw SAN storage capacity; CinchCast has over 50 million page views a month; Pinterest
has over 18 million visitors with a 10X growth rate; Amazon has over 55 million active customer accounts;
Flickr has over 4 billion queries per day; Netflix has 48 million members with over 50,000 requests per
second; and the list goes on. Thanks to HighScalability, from which the above statistics are derived, you
get a picture of the websites and the scale that I am referring to!
I covered the SaaS requirements in my previous blog. In this blog I shall provide a bird’s-eye
view of the technologies used by a few large scale websites and their impact on the multi-tier web
application architecture.
Reference: HighScalability has some awesome architecture blogs. I have derived the technology stacks from
a multitude of articles hosted by HighScalability and a few from the Netflix blogs.
http://highscalability.com/blog/category/example
Technology Stack
This section is not meant to be an extensive overview of each of the below websites and their technology
stack, but rather provides an overview of the variety of tools used and the logical layers that form the
architecture.
NOTE: The tools listed below for each company are a point-in-time snapshot and may have changed since.
Figure 1 New York Times Technology stack
Figure 2 HipChat Technology stack
Figure 3 Salesforce Technology stack
Figure 4 Netflix Technology stack
I have elaborated on SaaS requirements and general architectural requirements in my previous blog.
https://prashanthpanduranga.wordpress.com/2015/03/25/inevitability-of-multi-tenancy-saas-in-product-engineering/
The requirements which apply the most to large scale web applications:
Performance: There are a lot of statistics published relating to the performance implications of web applications.
One such statistic: users abandon a website even if there is a 2 second delay during a transaction. Large scale
web applications obviously have very high performance requirements. A very large percentage of
applications get redesigned primarily for a better user experience. Average Page Load Time as a fraction
of Server and Client time, Network time, Page views, Bounce Rate, Percentage Exit, Average Redirection
Time, Average Domain Lookup Time, Average Server Connection Time, Average Server Response Time,
Average Page Download Time, Average Content Load Time, Average Session Time, DNS Resolution Time, TCP
Connection Time, Time to First Byte, Full Page Object Load Time, Requests per Second, Error Rates, Peak
Response Time, Uptime, CPU Utilization and Memory Utilization are just a few metrics to look for.
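To make a few of these metrics concrete, here is a minimal Python sketch that derives requests per second, average and peak response time, and error rate from a list of observed requests. The `RequestSample` record and the `summarize` function are hypothetical names introduced only for this illustration; they do not come from any of the sites or tools discussed here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestSample:
    """One observed request (hypothetical record shape for this sketch)."""
    response_ms: float  # server response time in milliseconds
    ok: bool            # False if the request ended in an error


def summarize(samples: list[RequestSample], window_seconds: float) -> dict[str, float]:
    """Compute a few of the metrics named above for one observation window."""
    if not samples or window_seconds <= 0:
        raise ValueError("need at least one sample and a positive window")
    times = [s.response_ms for s in samples]
    errors = sum(1 for s in samples if not s.ok)
    return {
        "requests_per_second": len(samples) / window_seconds,
        "average_response_ms": mean(times),
        "peak_response_ms": max(times),
        "error_rate": errors / len(samples),
    }


# Four requests observed over a two second window.
samples = [
    RequestSample(120.0, True),
    RequestSample(80.0, True),
    RequestSample(400.0, False),
    RequestSample(200.0, True),
]
print(summarize(samples, window_seconds=2.0))
```

In practice these numbers would come from access logs or a monitoring agent rather than a hand-built list; the point is only that each metric is a simple aggregate over the same request stream.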
Availability: Business continuity is of utmost importance. Various availability techniques can be applied on
every layer: AlwaysOn Failover Cluster Instances, AlwaysOn Availability Groups, Database mirroring, Log
shipping; Redundancy models (Active-Active, Active-Passive); Redundancy methods (Hot Standby, Warm
Standby, Cold Standby). Availability is measured in terms of mean time to failure (MTTF) and mean time
to repair (MTTR), as Availability = MTTF / (MTTF + MTTR). Other techniques include eliminating single
points of failure, accelerating fault detection, isolation and resolution, hot spares, warm spares, cold spares,
clustering, RAID, redundancy, heartbeats, watermarking resources, checkpointing, watchdogs and more.
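As a small worked example of the availability measure above, the following Python sketch computes MTTF / (MTTF + MTTR) and the downtime it implies per year. The redundancy helper assumes replicas fail independently, which is a simplifying assumption of mine and rarely holds exactly in production; the function names are hypothetical.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability = MTTF / (MTTF + MTTR)."""
    if mttf_hours < 0 or mttr_hours < 0 or mttf_hours + mttr_hours == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf_hours / (mttf_hours + mttr_hours)


def yearly_downtime_hours(avail: float) -> float:
    """Expected downtime over a 365-day year for a given availability."""
    return (1.0 - avail) * 365 * 24


def redundant_availability(avail: float, replicas: int) -> float:
    """Availability of N active replicas, ASSUMING independent failures."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return 1.0 - (1.0 - avail) ** replicas


single = availability(mttf_hours=999, mttr_hours=1)   # three nines
print(f"single node: {single:.4f}, downtime/year: {yearly_downtime_hours(single):.2f} h")
print(f"two replicas: {redundant_availability(single, 2):.6f}")
```

A node that fails on average every 999 hours and takes 1 hour to repair is 99.9% available, which is roughly 8.76 hours of downtime a year; shortening MTTR (faster detection and repair) improves the figure just as lengthening MTTF does.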
Monitoring and Diagnostics:
26 front end proxy servers, and double that in back end app servers [Atlassian HipChat]; 15,000+ hardware
systems [Salesforce]; 100 hardware nodes in production [CinchCast]; 180 Web Engines + 240 API
Engines, 88 MySQL DBs (cc2.8xlarge) + 1 slave each, 110 Redis instances, 200 Memcache instances, 4 Redis
Task Managers + 80 Task Processors, sharded Solr [Pinterest]; 1000+ supported devices [Netflix].
Imagine maintaining those. When there are innumerable servers involved, failure of system components is
common; needless to say, monitoring and diagnostics play a very important role in taking appropriate
and timely action.
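One common building block for this kind of monitoring is heartbeat-based failure detection: every server reports in periodically, and a node that stays silent longer than a timeout is flagged. The sketch below is a minimal in-memory version; the `HeartbeatMonitor` class, the node names and the 5 second timeout are all hypothetical and not taken from any of the stacks above.

```python
import time
from typing import Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes that went silent."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: dict[str, float] = {}

    def beat(self, node: str, now: Optional[float] = None) -> None:
        """Record a heartbeat; `now` can be injected to make tests deterministic."""
        self._last_seen[node] = time.monotonic() if now is None else now

    def failed_nodes(self, now: Optional[float] = None) -> list[str]:
        """Return nodes whose last heartbeat is older than the timeout."""
        current = time.monotonic() if now is None else now
        return sorted(
            node
            for node, seen in self._last_seen.items()
            if current - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=5.0)
monitor.beat("web-1", now=0.0)   # last heard from at t=0
monitor.beat("web-2", now=8.0)   # last heard from at t=8
print(monitor.failed_nodes(now=10.0))  # web-1 has been silent for 10 s
```

A real deployment would feed the result into alerting or automated re-provisioning rather than printing it.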
Scalability: The capability of supporting increasing workloads while optimizing resource utilization along
various dimensions such as memory, cores, data structures, throughput and more. It goes without saying that
the application needs to scale in order to perform well, and without automating that scaling, the
application cannot stay inexpensive.
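To illustrate what automating scale decisions can look like, here is a minimal sketch of a proportional (target-tracking style) rule: size the fleet so that average utilization moves toward a target. The rule, the function name and the default bounds are my own assumptions for illustration and are not a description of any particular cloud provider's autoscaler.

```python
import math


def desired_instances(
    current: int,
    observed_utilization: float,
    target_utilization: float,
    minimum: int = 1,
    maximum: int = 100,
) -> int:
    """Return the instance count that would bring utilization near the target.

    Utilization values are percentages (e.g. 90.0 for 90% CPU).
    """
    if current < 1 or target_utilization <= 0 or minimum > maximum:
        raise ValueError("invalid scaling parameters")
    raw = math.ceil(current * observed_utilization / target_utilization)
    # Clamp so that a metric spike or dip cannot scale without bound.
    return max(minimum, min(maximum, raw))


print(desired_instances(current=4, observed_utilization=90.0, target_utilization=60.0))
print(desired_instances(current=10, observed_utilization=15.0, target_utilization=60.0, minimum=2))
```

The first call scales out because the fleet is running hot; the second scales in, which is where the cost saving comes from.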
Automation: Identifying failures and automating re-provisioning of those components/servers is extremely
important.
Architecture has evolved significantly from a monolithic architecture to a microservices architecture.
Figure 5 Minimalistic layers
Let’s take a look at the logical segregation of layers in current large scale web applications.
[Figure labels: Application, Data; Presentation, Service/Business Logic, Data]
In the following table we look at some of the technologies used by high scale websites and compare them
to other similar tools/technologies.
Note: The technologies covered by the above websites were used as the benchmark for comparison.
Comments are provided only where I wanted to provide additional details.
Tools/software | Usage/Description | Top few similar tools/software to consider | Comments
RabbitMQ | Message broker system; message oriented middleware; queuing software; ESB | ActiveMQ, Amazon SQS, HornetQ, HiveMQ, JMS, Kafka, ZeroMQ, MSMQ, NServiceBus, Azure Service Bus, OpenMQ, Redis, Storm, Akka, Apache Camel and Spring, OFM, Fuse ESB, WebSphere MQ, Windows Service Bus, BizTalk, WSO2, Mule, Talend ESB, Gearman, JBoss, ServiceMix, OpenESB, Apache QPID | Look out for AMQP compliance. Some of the tools referred to aren’t message brokers but are used in conjunction to perform the same. Other AMQP: PIKA, Shovel
Kinesis | Real time data processing | Kafka, Storm |
Tornado | Web server; HTTP server | Nginx, Apache, IIS, lighttpd, haproxy, varnish, glassfish, Jetty, Geronimo, Tomcat | Some are even used as reverse proxies and proxies
[Figure: logical layers in current large scale web applications. Labels: Presentation, Service/Business Logic, Data, Security, Analytics, Monitoring & Diagnostics, Storage, Events, Notification, Virtualization, Network, Compute, Hardware Layer, Cloud, Cloud Adapter, Automation/Batch, Integration, Processing Engine, Cache, Management, Ad-hoc Layer, Auditing, Configuration, Distributed Processing, Ingestion, Data Management, Data Rules, Meta Data, Data Quality, Clickstream Data, Sources, Chat, Sensor, Social, Logs, CRM, ERP, Application, Data, Channels, EDW, Metadata Management, Multi Tenant Processing, Parallel Compute, Parallel Operations, Complex Event Processing, Governance, Workflow, Reporting, Stream Processing, In-Memory Processing, Message Transports, Message Queues, Message Brokers, Integration Frameworks, Enterprise Service Bus, Integration Suites, Operating System]
Cassandra | Distributed database management system | Mongodb, aerospike, accumulo, azure table storage, bigtable, couchbase, couchdb, dynamodb, datastax, ElasticSearch, Greenplum, Vertica, HBase, InfiniDb, InnoDB, MariaDB, neo4j, Netezza, TeraData, RedShift, Riak, RavenDB, Solr, Spark, VoltDB | With over 200 DBs, it’s difficult to list them all; check out this link: https://prashanthpanduranga.wordpress.com/2013/12/23/why-nosql-ok-but-why-so-many/ Some of them are data warehousing solutions, while some are data processing engines
Hana | In-memory database | GemFire, Hekaton, Aerospike, BigMemory, DataBlitz, EhCache, eXtremeDB, FuelDB, HazelCast, MonetDB, Coherence, VoltDB | Any key value store can be used for the same; some enterprises have experimented with using a NoSQL store as both a cache and an unstructured DB solution
Linux | Operating system | SUSE, FreeBSD, Solaris, Debian/Ubuntu, Windows Server, Mac OS X, RHEL | Only server OSes are included in the list
SockJS | WebSocket-like object | Web socket, socket.io, atmosphere, SignalR, Alchemy, Fleck | A couple of those listed are open source
Libev | Event loop | LibEvent, Asio, Nginx, epoll |
Fabrik | Visual programming IDE; also known for Content Construction Kits (CCK) | IDE: Visual Studio, Eclipse, NetBeans, Aptana. CCKs: Seblod, K2, chronoform, Zoo, Breezingforms, Cobalt, FlexiContent |
Java | Programming language | C, C++, Python, C#, PHP, Javascript, Ruby, R, Matlab, Objective-C, Visual Basic, Perl, Swift, Scala, Shell, Go, LISP, SAS, F#, Groovy, Lua | Some of those listed are web programming languages. Additional: HTML, SQL, Haskell
Twisted | Event driven network programming framework | Tornado, Django, Asyncio |
AWS | Cloud provider | Azure, Rackspace, CenturyLink, Salesforce, Engineyard, Google, OpenStack, SAP, CloudBees, CumuLogic, Eucalyptus, Gigaspaces, Mulesoft, Parallels, Pivotal, puppetLabs, Ravello, Rightscale, SoftwareAG, Xively, AT&T, Cisco, Comcast, EMC, GoGrid, CSC, HP, IBM SmartCloud, Joyent | The list includes infrastructure, platform, storage and security cloud providers
Lucene | Text search engine library | Azure Search, Autonomy, Solr, GSA, Attivio, DTSearch, elasticSearch, endeca, FAST, MarkLogic, Nutch, Sphinx, Sketchy, Scumblr | A few NoSQL databases have been used for the same; this list does not include all the NoSQL databases that could be used
Adobe Air | Cross-platform runtime | Cordova (Phonegap), Appcelerator, Qt, Sencha, cocos2d-x, Xamarin, ionic, Kony, mono, xcode | The ones listed here are cross-platform as well as mobile development platforms
Sensu | Monitoring framework | Zabbix, Nagios, icinga, monit, Riemann, statsd, graphite, zenoss, collectd, munin, cacti, New Relic, ganglia, splunk, sentry, dynatrace, datadog, skylight, observium, spiceworks, solarwinds, fiddler, wireshark, httpwatch, firebug, soapUI, OpManager | The list includes some of the: infrastructure monitoring; searching, monitoring and analysing; network monitoring; scalable distributed monitoring systems
PagerDuty, NeoLoad | Incident management system, and performance testing and monitoring | OpsGenie, VictorOps, xmatters, pingdom, Gomez, webpagetest, monitis, uptrends, keynote, OpsView, Apache JMeter, LoadRunner, WebLOAD, Appvance, NeoLoad, LoadUI, WAPT, Loadster, LoadImpact, Soasta, Rational Performance Tester, Testing Anywhere, OpenSTA, QEngine (ManageEngine), Loadstorm, CloudTest, Httperf, SilkPerformer, BlazeMeter, Visual Studio Test Suite | Also includes website monitoring, cloud based quality testing and performance monitoring
Chef | IT automation | Puppet, ansible, salt, docker, Jenkins, Capistrano, saltstack | Configuration management, SCCM
memCache | Distributed memory object caching | Apc, memcached, dynacache, ehcache, xcache | Key value based NoSQL databases are also used
Razor | Physical and virtual hardware provisioning solution | Axemblr, Cobbler, JuJU, SaltCloud, Dell Crowbar, Ansible, CFEngine, Chef |
Perforce | Version management and content collaboration | Git, SVN, TFS, bitbucket, ClearCase, Subversion |
Pytheas | ITIL assets management software | Remedy (BMC), Assyst (Axios), FrontRange, EasyVista, Hornbill, HP Service Manager, SmartCloud Control Desk (IBM), ServiceNow | IT incident management, IT problem management, IT change management, IT release governance, IT user self-service, IT request management, IT knowledge management, IT service support analytics and reporting, IT SLA management. Ref: Gartner
ZUUL | Service that provides dynamic routing, monitoring, resiliency and security | Nginx, lighttpd, Netscaler, HAProxy, Radware, CoyotePoint, Barracuda, Kemp, Varnish, Avast, Norton, Kaspersky, McAfee, AVG, Bitdefender, F5, PaloAlto, Cisco ASA, Cisco ACE, Foundry, Juniper SSG, MS TMG | Can be a firewall, router, web load balancing server, proxy server, etc.
Feign | Java HTTP client binder | Retrofit, JAX-RS, websocket, Jersey, CXF, Apache HC | Includes transport libraries
Hive | Querying and managing large datasets residing in distributed storage | Impala, BigSQL, HAWQ |
AWS ELB | Elastic Load Balancing | Nginx, HAProxy, Route53, Azure Traffic Manager, F5 | Port-bound servers, sticky sessions, TCP session reassignment, automatic unfail, slow start, SynGuard, dynamic feedback protocol, NAT, maximum connection, Round Robin, Least Connections, Weighted Round Robin, Weighted Least Connections, Fastest Response; Layer 4 and Layer 7 load balancing. Cloud load balancing features: dedicated (static) IP address, SSL termination, multiple protocols, advanced access control, connection logging, advanced algorithmic routing, session persistence, connection throttling, node management, high availability, content caching, persistent connections, Gzip compression, regionalized load balancers
gZip | Application used for file compression and decompression | httpZip, deflate, 7zip, bzip2, zlib |
Akamai | Content delivery network | Azure CDN, Cloudfront, Torbit, Incapsula, Cotendo, Fastly |
HTML 5 frameworks, Javascript frameworks | | https://www.facebook.com/notes/prashanth-panduranga/frameworks/10152107517972934 |
OpenStack | Open source cloud computing platform | OpenStack currently has the following features: Compute (Nova), Object Storage (Swift), Block Storage (Cinder), Networking (Neutron), Dashboard (Horizon), Identity Service (Keystone), Image Service (Glance), Telemetry (Ceilometer), Orchestration (Heat), Database (Trove), Bare Metal Provisioning (Ironic), Multiple tenant cloud messaging (Zaqar), Elastic MapReduce (Sahara) |
Hadoop | Distributed storage and distributed processing of very large data sets on computer clusters
Aegisthus | Bulk data pipeline out of Cassandra
Eureka | Eureka is a REST (Representational State Transfer) based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing and failover of middle-tier servers
Genie | Federated job execution engine
Clojure | Dynamic programming language that targets the Java Virtual Machine
PigPen | Map-Reduce for Clojure
Governator | Governator is a library of extensions and utilities that enhance Google Guice to provide: classpath scanning and automatic binding, lifecycle management, configuration to field mapping, field validation and parallelized object warmup
Inviso | Visualize Hadoop performance
Ribbon | Ribbon is an Inter Process Communication (remote procedure calls) library with built-in software load balancers
Hystrix | Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable
Suro | Distributed data pipeline
Aminator | A tool for creating EBS AMIs
Lipstick | Pig visualization framework
Zeno | In-memory data propagation framework
Blesk | Lightweight client for pushing notifications to web based applications/sites
Turbine | Turbine is a tool for aggregating streams of Server-Sent Event (SSE) JSON data into a single stream. The targeted use case is metrics streams from instances in an SOA being aggregated for dashboards
Priam | Co-process for backup/recovery, token management, and centralized configuration management for Cassandra
Workflowable | Workflowable is a Ruby gem that allows adding flexible workflow functionality to Ruby on Rails applications
s3mper | S3mper is a library that provides an additional layer of consistency checking on top of Amazon's S3 index through use of a consistent, secondary index
Astyanax | Java client for Apache Cassandra
Denominator | Denominator is a portable Java library for manipulating DNS clouds. Denominator has pluggable back-ends, including AWS Route53, Neustar Ultra, DynECT, Rackspace Cloud DNS, OpenStack Designate, and a mock for testing
GCViz | Garbage collector visualization framework
Curator | The Curator Framework is a high-level API that greatly simplifies using ZooKeeper. It adds many features that build on ZooKeeper and handles the complexity of managing connections to the ZooKeeper cluster and retrying operations
Staash | A language-agnostic as well as storage-agnostic web interface for storing data into persistent storage systems; the metadata layer abstracts a lot of storage details and the pattern automation APIs take care of automating common data access patterns
Edda | Edda is a service to track changes in cloud deployments
Brutal | An async centered chat bot framework for Python programmers written using the Twisted framework
CassJMeter | JMeter plugin to run Cassandra tests
Glisten | Groovy library for building JVM applications with Amazon Simple Workflow (SWF)
Pig | Platform for analyzing large data sets
Spark | Engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing
Karyon | Framework and a library for a cloud ready web service. Blueprint for the services. It contains bootstrapping, libraries and lifecycle management, runtime insights and diagnostics, pluggable web resources, cloud-ready hooks
EBS | Elastic Block Store, persistent block level storage volume
Curler | A Gearman worker which cURLs to do work
Archaius | Library for configuration management API
ZooKeeper | ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
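Several of the tools in the table (AWS ELB, Ribbon, HAProxy and others) are load balancers, and the ELB row names the Round Robin, Weighted Round Robin and Least Connections algorithms. As a rough illustration of how those three differ, here is a minimal Python sketch. The class names and server names are hypothetical, and the weighted variant uses the simplest possible scheme (repeating a server in the rotation according to its integer weight), which is not how any specific product is documented to implement it.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers: list[str]) -> None:
        if not servers:
            raise ValueError("need at least one server")
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class WeightedRoundRobin:
    """Rotation in which a server appears once per unit of integer weight."""

    def __init__(self, weights: dict[str, int]) -> None:
        expanded = [s for s, w in weights.items() for _ in range(w)]
        if not expanded:
            raise ValueError("need at least one server with positive weight")
        self._cycle = itertools.cycle(expanded)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Send each new connection to the server with the fewest active ones."""

    def __init__(self, servers: list[str]) -> None:
        if not servers:
            raise ValueError("need at least one server")
        self.active = {server: 0 for server in servers}

    def acquire(self) -> str:
        # Ties go to the server listed first (dicts keep insertion order).
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1


rr = RoundRobin(["app-1", "app-2", "app-3"])
print([rr.pick() for _ in range(4)])

wrr = WeightedRoundRobin({"big": 2, "small": 1})
print([wrr.pick() for _ in range(6)])
```

Round robin ignores how busy a server is, weighted round robin accounts for differing capacity, and least connections reacts to actual load, which matters when request durations vary widely.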
Parallel processing (explicit and implicit parallelism, batch parallelism), asynchronous programming,
segregating layers, distributing workloads, load balancing, multi-tenancy, scaling out on all layers,
sharding, partitioning, CAP preference, reads, writes, statelessness, logging and telemetry, automating,
SOA adoption, caching, throttling, distributing requests across multiple zones, effective usage of CDNs,
auto provisioning, autoscaling, compression, queuing, workload distribution, batch processing, designing
systems with fault tolerance, redundancy, Consistency, Availability, Partition Tolerance, event processing,
web sockets, cloud computing, fog computing, grid computing, client side workload distribution,
in-memory processing, proxies, no single points of failure, resilience to failure, graceful degradation,
recoverability from failure, design for failure, database transactions, client side transactions, two-phase
commit, auto-commit, partition everything, DB operations ordering, considerations for eventual
consistency, functional segmentation, application pools, prevention of session state, async everywhere,
and indexes (structured indexes, text indexes, entity indexes, fuzzy match indexes, pre-aggregated indexes,
pre-calculated indexes, embedded value indexes, join indexes, link indexes, de-normalized indexes of all kinds)
are all important considerations for a highly successful and scalable website.
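Sharding and partitioning are among the items above that benefit from a concrete picture. One widely used way to assign keys to shards is consistent hashing, where adding or removing a node only moves the keys that belonged to that node. The sketch below is a minimal version of that technique; the class name, the node names and the choice of 100 virtual nodes per server are assumptions made for illustration.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Map a string to a 64-bit integer that is stable across processes."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    """Assigns keys to nodes using a hash ring with virtual nodes."""

    def __init__(self, nodes: list[str], virtual_nodes: int = 100) -> None:
        self.virtual_nodes = virtual_nodes
        self._ring: list[tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node owns many points on the ring to even out the distribution.
        for i in range(self.virtual_nodes):
            bisect.insort(self._ring, (stable_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring point at or after the key."""
        if not self._ring:
            raise LookupError("ring has no nodes")
        index = bisect.bisect_left(self._ring, (stable_hash(key), ""))
        return self._ring[index % len(self._ring)][1]  # wrap around the ring


ring = ConsistentHashRing(["db-1", "db-2", "db-3"])
print(ring.node_for("user:42"))
```

With plain `hash(key) % number_of_shards`, changing the shard count remaps most keys; with the ring, removing `db-2` leaves every key that lived on `db-1` or `db-3` where it was.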
Rest assured, if you have considered all the above factors in your architecture, you are on your way to creating
a scalable one. Do let me know if you have questions regarding any particular subject and I will be glad to
write up on the same.

Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...
 
Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL Technologies
 
The future of scaling forrester research - GigaSpaces Road Show 2011
The future of scaling forrester research - GigaSpaces Road Show 2011The future of scaling forrester research - GigaSpaces Road Show 2011
The future of scaling forrester research - GigaSpaces Road Show 2011
 
Symphony Driver Essay
Symphony Driver EssaySymphony Driver Essay
Symphony Driver Essay
 
Data Services and the Modern Data Ecosystem (ASEAN)
Data Services and the Modern Data Ecosystem (ASEAN)Data Services and the Modern Data Ecosystem (ASEAN)
Data Services and the Modern Data Ecosystem (ASEAN)
 
Introduction to cloudify - workshop 2013
Introduction to cloudify - workshop 2013Introduction to cloudify - workshop 2013
Introduction to cloudify - workshop 2013
 
Introduction to Couchbase: Onomi
Introduction to Couchbase: OnomiIntroduction to Couchbase: Onomi
Introduction to Couchbase: Onomi
 

Mais de Prashanth Panduranga (11)

WebApplicationArchitectureAzure.pptx
WebApplicationArchitectureAzure.pptxWebApplicationArchitectureAzure.pptx
WebApplicationArchitectureAzure.pptx
 
WebApplicationArchitectureAzure.pdf
WebApplicationArchitectureAzure.pdfWebApplicationArchitectureAzure.pdf
WebApplicationArchitectureAzure.pdf
 
Social review
Social reviewSocial review
Social review
 
Meet mi
Meet miMeet mi
Meet mi
 
Flex matics
Flex maticsFlex matics
Flex matics
 
Doc byyou
Doc byyouDoc byyou
Doc byyou
 
Being there
Being thereBeing there
Being there
 
Agri future
Agri futureAgri future
Agri future
 
Introduction to Enterprise architecture and the steps to perform an Enterpris...
Introduction to Enterprise architecture and the steps to perform an Enterpris...Introduction to Enterprise architecture and the steps to perform an Enterpris...
Introduction to Enterprise architecture and the steps to perform an Enterpris...
 
Why nosql also_why_somany
Why nosql also_why_somanyWhy nosql also_why_somany
Why nosql also_why_somany
 
Mongo learning series
Mongo learning series Mongo learning series
Mongo learning series
 

Último

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Último (20)

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Architecting Extremely Large Scale Web Applications

  • 1. ARCHITECTING EXTREMELY LARGE SCALE WEB APPLICATIONS
A MUST read for every architect
ABSTRACT: An overview of the much-needed shift in architectural thinking from monolithic to microservices architecture.
PRASHANTH B PANDURANGA, Director of Technology
  • 2. The New York Times auto-scaled to 500,000 users; HipChat has about 1.2 billion messages/documents stored; Salesforce deals with 1,300,000,000 daily transactions, with over 24,000 database transactions per second and over 22 PB of raw SAN storage capacity; CinchCast has over 50 million page views a month; Pinterest has over 18 million visitors with a 10X growth rate; Amazon has over 55 million active customer accounts; Flickr has over 4 billion queries per day; Netflix has 48 million members with over 50,000 requests per second; and the list goes on. Thanks to HighScalability, from which the above statistics are derived, you get a picture of the websites and the scale that I am referring to!
I covered the SaaS requirements in my previous blog. In this blog I provide a bird's-eye view of the technologies used by a few large-scale websites and their impact on multi-tier web application architecture.
Reference: HighScalability has some awesome architecture blogs. I have derived the technology stacks from a multitude of articles hosted by HighScalability and a few from the Netflix blogs: http://highscalability.com/blog/category/example
Technology Stack
This section is not meant to be an extensive overview of each of the websites below and their technology stacks; rather, it provides an overview of the variety of tools used and the logical layers that form the architecture.
NOTE: The tools listed for each company reflect a point in time and may have changed since.
Figure 1: New York Times technology stack
  • 3. Figure 2: HipChat technology stack
Figure 3: Salesforce technology stack
  • 4. Figure 4: Netflix technology stack
I have elaborated on SaaS requirements and general architectural requirements in my previous blog: https://prashanthpanduranga.wordpress.com/2015/03/25/inevitability-of-multi-tenancy-saas-in-product-engineering/
The requirements that apply most to large-scale web applications:
Performance: Many statistics have been published on the performance implications of web applications. One such statistic: users abandon a website even if there is a 2-second delay during a transaction. Large-scale web applications obviously have very high performance requirements. A very large percentage of applications get redesigned primarily for a better user experience. Average page load time as a fraction of server and client time, network time, page views, bounce rate, percentage exit, average redirection time, average domain lookup time, average server connection time, average server response time, average page download time, average content load time, average session time, DNS resolution time, TCP connection time, time to first byte, full page object load time, requests per second, error rates, peak response time, uptime, CPU utilization and memory utilization are just a few metrics to look for.
Availability: Business continuity is of utmost importance. Various availability techniques can be applied at every layer: AlwaysOn Failover Cluster Instances, AlwaysOn Availability Groups, database mirroring, log shipping, redundancy models (active-active, active-passive), redundancy methods (hot standby, warm standby, cold standby), and the measurement of the same expressed as mean time to failure (MTTF), mean time to repair (MTTR), availability as a measure of MTTF / (MTTF + MTTR), eliminating single points of failure,
  • 5. accelerating fault detection, isolation and resolution, hot spares, warm spares, cold spares, clustering, RAID, redundancy, heartbeats, watermarking resources, checkpointing, watchdogs and more.
Monitoring and Diagnostics: 26 front-end proxy servers, and double that in back-end app servers [Atlassian HipChat]; 15,000+ hardware systems [Salesforce]; 100 hardware nodes in production [CinchCast]; 180 web engines + 240 API engines, 88 MySQL DBs (cc2.8xlarge) + 1 slave each, 110 Redis instances, 200 Memcache instances, 4 Redis task managers + 80 task processors, sharded Solr [Pinterest]; 1000+ supported devices [Netflix]. Imagine maintaining those. When innumerable servers are involved, failure of system components is common, so monitoring and diagnostics play a very important role in taking appropriate, timely action.
Scalability: The capability to support increasing workloads while optimizing resource utilization along dimensions such as memory, cores, data structures, throughput and more. It goes without saying that the application needs to scale in order to perform well, and without automating that scaling the application cannot stay inexpensive.
Automation: Identifying failures and automating the re-provisioning of the failed components/servers is extremely important.
Architecture has evolved significantly from a monolithic architecture to a microservices architecture.
Figure 5: Minimalistic layers
Let's take a look at the logical segregation of layers in current large-scale web applications.
[Figure labels: APPLICATION DATA, PRESENTATION, SERVICE/BUSINESS LOGIC, DATA]
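As an aside on the availability measure quoted above, MTTF / (MTTF + MTTR), here is a minimal Python sketch of how the figure translates into expected downtime. The function names and the sample numbers (999 hours between failures, 1 hour to repair) are illustrative assumptions, not values taken from any of the websites discussed.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as MTTF / (MTTF + MTTR)."""
    if mttf_hours < 0 or mttr_hours < 0 or (mttf_hours + mttr_hours) == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf_hours / (mttf_hours + mttr_hours)


def expected_downtime_hours_per_year(avail: float) -> float:
    """Hours per year a component with this availability is expected to be down."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical component: fails on average every 999 h, takes 1 h to repair.
    a = availability(mttf_hours=999.0, mttr_hours=1.0)
    print(f"availability: {a:.4f}")
    print(f"expected downtime: {expected_downtime_hours_per_year(a):.2f} h/year")
```

Under these assumed numbers the formula shows why both levers matter: availability improves either by failing less often (higher MTTF) or by recovering faster (lower MTTR), which is where fault detection and automated re-provisioning come in.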
  • 6. In the following table we look at some of the technologies used by high-scale websites and compare them to other similar tools/technologies.
Note: The technologies used by the above websites were taken as the benchmark for comparison. Comments are provided only where I wanted to add details.
Tools/software | Usage/Description | Top few similar tools/software to consider | Comments
RabbitMQ | Message broker system; message-oriented middleware; queuing software; ESB | ActiveMQ, Amazon SQS, HornetQ, HiveMQ, JMS, Kafka, ZeroMQ, MSMQ, NServiceBus, Azure Service Bus, OpenMQ, Redis, Storm, Akka, Apache Camel and Spring, OFM, Fuse ESB, WebSphere MQ, Windows Service Bus, BizTalk, WSO2, Mule, Talend ESB, Gearman, JBoss, ServiceMix, OpenESB, Apache Qpid | Look out for AMQP compliance. Some of the tools listed aren't message brokers but are used in conjunction with one to achieve the same result. Other: AMQP, Pika, Shovel
Kinesis | Real-time data processing | Kafka, Storm |
Tornado | Web server; HTTP server | Nginx, Apache, IIS, lighttpd, HAProxy, Varnish, GlassFish, Jetty, Geronimo, Tomcat | Some are even used as reverse proxies or proxies
[Figure: logical layers of a large-scale web application. Labels: PRESENTATION, SERVICE/BUSINESS LOGIC, DATA, SECURITY, ANALYTICS, MONITORING & DIAGNOSTICS, STORAGE, EVENTS, NOTIFICATION, VIRTUALIZATION, NETWORK, COMPUTE, HARDWARE LAYER, CLOUD, CLOUD ADAPTER, AUTOMATION/BATCH, INTEGRATION, PROCESSING ENGINE, CACHE, MANAGEMENT, AD-HOC LAYER, AUDITING, CONFIGURATION, DISTRIBUTED PROCESSING, INGESTION, DATA MANAGEMENT, DATA RULES, META DATA, DATA QUALITY, CLICKSTREAM, DATA SOURCES, CHAT, SENSOR, SOCIAL, LOGS, CRM, ERP, APPLICATION DATA, CHANNELS, EDW, METADATA MANAGEMENT, MULTI TENANT PROCESSING, PARALLEL COMPUTE, PARALLEL OPERATIONS, COMPLEX EVENT PROCESSING, GOVERNANCE, WORKFLOW, REPORTING, STREAM PROCESSING, IN-MEMORY PROCESSING, MESSAGE TRANSPORTS, MESSAGE QUEUES, MESSAGE BROKERS, INTEGRATION FRAMEWORKS, ENTERPRISE SERVICE BUS, INTEGRATION SUITES, OPERATING SYSTEM]
  • 7. Cassandra | Distributed database management system | MongoDB, Aerospike, Accumulo, Azure Table Storage, Bigtable, Couchbase, CouchDB, DynamoDB, DataStax, Elasticsearch, Greenplum, Vertica, HBase, InfiniDB, InnoDB, MariaDB, Neo4j, Netezza, Teradata, Redshift, Riak, RavenDB, Solr, Spark, VoltDB | With over 200 DBs, it's difficult to list them all; check out this link: https://prashanthpanduranga.wordpress.com/2013/12/23/why-nosql-ok-but-why-so-many/ Some of them are data warehousing solutions, while some are data processing engines
Hana | In-memory database | GemFire, Hekaton, Aerospike, BigMemory, DataBlitz, Ehcache, eXtremeDB, FuelDB, Hazelcast, MonetDB, Coherence, VoltDB | Any key-value store can be used for the same; some enterprises have experimented with using a NoSQL store as both a cache and an unstructured DB solution
Linux | Operating system | SUSE, FreeBSD, Solaris, Debian/Ubuntu, Windows Server, Mac OS X, RHEL | Only server OSes are included in the list
SockJS | WebSocket-like object | WebSocket, Socket.IO, Atmosphere, SignalR, Alchemy, Fleck | A couple of those listed are open source
Libev | Event loop | Libevent, Asio, Nginx, epoll |
Fabrik | Visual programming IDE; also known for content construction kits (CCK) | IDEs: Visual Studio, Eclipse, NetBeans, Aptana. CCKs: Seblod, K2, Chronoform, Zoo, BreezingForms, Cobalt, FlexiContent |
Java | Programming language | C, C++, Python, C#, PHP, JavaScript, Ruby, R, MATLAB, Objective-C, Visual Basic, Perl, Swift, Scala, Shell, Go, Lisp, SAS, F#, Groovy, Lua | Some of those listed are web programming languages. Additional: HTML, SQL, Haskell
Twisted | Event-driven network programming framework | Tornado, Django, asyncio |
AWS | Cloud provider | Azure, Rackspace, CenturyLink, Salesforce, Engine Yard, Google, OpenStack, SAP, CloudBees, CumuLogic, Eucalyptus, GigaSpaces, MuleSoft, Parallels, Pivotal, Puppet Labs, Ravello, (continued on the next slide) | The list includes infrastructure, platform, storage and security cloud providers
  • 8. AWS (continued) | | RightScale, SoftwareAG, Xively, AT&T, Cisco, Comcast, EMC, GoGrid, CSC, HP, IBM SmartCloud, Joyent |
Lucene | Text search engine library | Azure Search, Autonomy, Solr, GSA, Attivio, dtSearch, Elasticsearch, Endeca, FAST, MarkLogic, Nutch, Sphinx, Sketchy, Scumblr | A few NoSQL databases have been used for the same; this list does not include all the NoSQL databases that could be used
Adobe AIR | Cross-platform runtime | Cordova (PhoneGap), Appcelerator, Qt, Sencha, cocos2d-x, Xamarin, Ionic, Kony, Mono, Xcode | The ones listed here are cross-platform as well as mobile development platforms
Sensu | Monitoring framework | Zabbix, Nagios, Icinga, Monit, Riemann, StatsD, Graphite, Zenoss, collectd, Munin, Cacti, New Relic, Ganglia, Splunk, Sentry, Dynatrace, Datadog, Skylight, Observium, Spiceworks, SolarWinds, Fiddler, Wireshark, HttpWatch, Firebug, SoapUI, OpManager | The list includes tools for: infrastructure monitoring; searching, monitoring and analysing; network monitoring; scalable distributed monitoring
PagerDuty, NeoLoad | Incident management system; performance testing and monitoring | OpsGenie, VictorOps, xMatters, Pingdom, Gomez, WebPagetest, Monitis, Uptrends, Keynote, Opsview, Apache JMeter, LoadRunner, WebLOAD, Appvance, NeoLoad, LoadUI, WAPT, Loadster, LoadImpact, SOASTA, Rational Performance Tester, Testing Anywhere, OpenSTA, QEngine (ManageEngine), LoadStorm, CloudTest, httperf, SilkPerformer, BlazeMeter, Visual Studio Test Suite | Also includes website monitoring, cloud-based quality testing and performance monitoring
Chef | IT automation | Puppet, Ansible, Salt, Docker, Jenkins, Capistrano, SaltStack | Configuration management, SCCM
memCache | Distributed memory object caching | APC, memcached, DynaCache, Ehcache, XCache | Key-value-based NoSQL databases are also used
Razor | Physical and virtual hardware provisioning solution | Axemblr, Cobbler, Juju, SaltCloud, Dell Crowbar, Ansible, CFEngine, Chef |
Perforce | Version management and content collaboration | Git, SVN, TFS, Bitbucket, ClearCase, Subversion |
  • 9. Pytheas | ITIL assets management software | Remedy (BMC), Assyst (Axios), FrontRange, EasyVista, Hornbill, HP Service Manager, SmartCloud Control Desk (IBM), ServiceNow | IT incident management, IT problem management, IT change management, IT release governance, IT user self-service, IT request management, IT knowledge management, IT service support analytics and reporting, IT SLA management. Ref: Gartner
Zuul | Service that provides dynamic routing, monitoring, resiliency and security | Nginx, lighttpd, NetScaler, HAProxy, Radware, Coyote Point, Barracuda, Kemp, Varnish, Avast, Norton, Kaspersky, McAfee, AVG, Bitdefender, F5, Palo Alto, Cisco ASA, Cisco ACE, Foundry, Juniper SSG, MS TMG | Can be a firewall, router, web load-balancing server, proxy server etc.
Feign | Java HTTP client binder | Retrofit, JAX-RS, WebSocket, Jersey, CXF, Apache HC | Includes transport libraries
Hive | Querying and managing large datasets residing in distributed storage | Impala, BigSQL, HAWQ |
AWS ELB | Elastic load balancing | Nginx, HAProxy, Route 53, Azure Traffic Manager, F5 | Port-bound servers, sticky sessions, TCP session reassignment, automatic unfail, slow start, SynGuard, dynamic feedback protocol, NAT, maximum connection, Round Robin, Least Connections, Weighted Round Robin, Weighted
  • 10. Least Connections, Fastest Response. Layer 4 and Layer 7 load balancing. Cloud load balancing features: dedicated (static) IP address, SSL termination, multiple protocols, advanced access control, connection logging, advanced algorithmic routing, session persistence, connection throttling, node management, high availability, content caching, persistent connections, Gzip compression, regionalized load balancers
gzip | Application used for file compression and decompression | httpZip, deflate, 7zip, bzip2, zlib |
Akamai | Content delivery network | Azure CDN, CloudFront, Torbit, Incapsula, Cotendo, Fastly |
HTML 5 frameworks | JavaScript frameworks | https://www.facebook.com/notes/prashanth-panduranga/frameworks/10152107517972934 |
OpenStack | Open source cloud computing platform. OpenStack currently has the following features: Compute (Nova), Object Storage (Swift), Block Storage (Cinder), Networking (Neutron), Dashboard (Horizon), Identity Service (Keystone), Image Service (Glance), Telemetry (Ceilometer), Orchestration (Heat), Database (Trove), Bare Metal Provisioning (Ironic), Multiple tenant cloud messaging (Zaqar), Elastic MapReduce (Sahara)
Hadoop | Distributed storage and distributed processing of very large data sets on computer clusters
Aegisthus | Bulk data pipeline out of Cassandra
  • 11. Eureka | Eureka is a REST (Representational State Transfer) based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing and failover of middle-tier servers
Genie | Federated job execution engine
Clojure | Dynamic programming language that targets the Java Virtual Machine
PigPen | Map-Reduce for Clojure
Governator | Governator is a library of extensions and utilities that enhance Google Guice to provide: classpath scanning and automatic binding, lifecycle management, configuration to field mapping, field validation and parallelized object warmup
Inviso | Visualize Hadoop performance
Ribbon | Ribbon is an inter-process communication (remote procedure calls) library with built-in software load balancers
Hystrix | Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd-party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable
Suro | Distributed data pipeline
Aminator | A tool for creating EBS AMIs
Lipstick | Pig visualization framework
Zeno | In-memory data propagation framework
Blesk | Lightweight client for pushing notifications to web-based applications/sites
Turbine | Turbine is a tool for aggregating streams of Server-Sent Event (SSE) JSON data into a single stream. The targeted use case is metrics streams from instances in an SOA being aggregated for dashboards
Priam | Co-process for backup/recovery, token management, and centralized configuration management for Cassandra
Workflowable | Workflowable is a Ruby gem that allows adding flexible workflow functionality to Ruby on Rails applications
s3mper | S3mper is a library that provides an additional layer of consistency checking on top of Amazon's S3 index through use of a consistent, secondary index
Astyanax | Java client for Apache Cassandra
Denominator | Denominator is a portable Java library for manipulating DNS clouds. Denominator has pluggable back-ends, including AWS Route53, Neustar Ultra, DynECT, Rackspace Cloud DNS, OpenStack Designate, and a mock for testing
GCViz | Garbage collector visualization framework
Curator | The Curator Framework is a high-level API that greatly simplifies using ZooKeeper. It adds many features that build on ZooKeeper and handles the complexity of managing connections to the ZooKeeper cluster and retrying operations
Staash | A language-agnostic as well as storage-agnostic web interface for storing data into persistent storage systems; the metadata layer abstracts a lot of storage details and the pattern automation APIs take care of automating common data access patterns
Edda | Edda is a service to track changes in cloud deployments
Brutal | An async-centered chat bot framework for Python programmers, written using the Twisted framework
CassJMeter | JMeter plugin to run Cassandra tests
Glisten | Groovy library for building JVM applications with Amazon Simple Workflow (SWF)
Pig | Platform for analyzing large data sets
Spark | Engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing
  • 12. Karyon | Framework and library for a cloud-ready web service; a blueprint for services. It contains bootstrapping, libraries and lifecycle management, runtime insights and diagnostics, pluggable web resources, cloud-ready hooks
EBS | Elastic Block Store; persistent block-level storage volume
Curler | A Gearman worker which cURLs to do work
Archaius | Library for configuration management API
ZooKeeper | ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
Parallel processing (explicit and implicit parallelism, batch parallelism), asynchronous programming, segregating layers, distributing workloads, load balancing, multi-tenancy, scaling out on all layers, sharding, partitioning, CAP preference, reads, writes, statelessness, logging and telemetry, automating, SOA adoption, caching, throttling, distributing requests across multiple zones, effective usage of CDNs, auto-provisioning, autoscaling, compression, queuing, workload distribution, batch processing, designing systems with fault tolerance, redundancy, consistency, availability, partition tolerance, event processing, web sockets, cloud computing, fog computing, grid computing, client-side workload distribution, in-memory processing, proxies, no single points of failure, resilience to failure, graceful degradation, recoverability from failure, design for failure, database transactions, client-side transactions, two-phase commit, auto-commit, partition everything, DB operations ordering, considerations for eventual consistency, functional segmentation, application pools, prevention of session state, async everywhere, and indexes (structured indexes, text indexes, entity indexes, fuzzy match indexes, pre-aggregated indexes, pre-calculated indexes, embedded value indexes, join indexes, link indexes, de-normalized indexes of all kinds) are all important considerations for a highly successful and scalable website.
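To make one of these considerations concrete, here is a minimal Python sketch of two of the load-balancing strategies named in the comparison table, round robin and least connections. The class and method names are hypothetical and the server names are placeholders; this only illustrates the selection logic and is not how any particular load balancer (ELB, HAProxy, Nginx) is implemented.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Hands out the server that currently has the fewest active connections."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}
        if not self._active:
            raise ValueError("at least one server is required")

    def acquire(self):
        # Ties are broken by insertion order (the order servers were given).
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to `server` closes.
        if self._active[server] > 0:
            self._active[server] -= 1


if __name__ == "__main__":
    rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    print([rr.pick() for _ in range(4)])

    lc = LeastConnectionsBalancer(["app-1", "app-2"])
    first = lc.acquire()
    second = lc.acquire()
    lc.release(first)
    print(first, second, lc.acquire())
```

Round robin is the simpler choice when requests cost roughly the same; least connections tends to behave better when some requests hold a connection much longer than others, at the price of tracking per-server state.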
Rest assured, if you have considered all the above factors in your architecture, you are on your way to creating a scalable one. Do let me know if you have questions regarding any particular subject, and I will be glad to write about it.