2. Introduction
Microsoft Azure (formerly Windows Azure) is a cloud computing service created
by Microsoft for building, testing, deploying, and managing applications and
services through a global network of Microsoft-managed data centers. It
provides software as a service (SaaS), platform as a service (PaaS), and
infrastructure as a service (IaaS), and supports many different programming
languages, tools, and frameworks, including both Microsoft-specific and
third-party software and systems.
4. Compute
•Virtual machines: infrastructure as a service (IaaS) that lets users launch general-purpose Microsoft Windows and Linux virtual machines, as well as preconfigured machine images for popular software packages.
•App services: a platform as a service (PaaS) environment that lets developers easily publish and manage websites.
•WebJobs: applications that can be deployed to a Web App to implement background processing.
5. Mobile services
•Mobile Engagement collects real-time analytics that highlight users’ behavior. It also provides push notifications to mobile devices.
•HockeyApp can be used to develop, distribute, and beta-test mobile apps.
6. Storage services
•Storage Services provides REST and SDK APIs for storing and accessing data on the cloud.
•Table Service lets programs store structured text in partitioned collections of entities that are
accessed by partition key and row key. It is a NoSQL (non-relational) data store.
•Blob Service allows programs to store unstructured text and binary data as blobs that can be
accessed by an HTTP(S) path. Blob Service also provides security mechanisms to control access to
data.
•Queue Service lets programs communicate asynchronously by message using queues.
•File Service allows storing and accessing data on the cloud using the REST APIs or the SMB
protocol.
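
As a rough illustration of the Table Service data model described above, the following minimal in-memory sketch groups entities into partitions and looks them up by a partition key plus a second key that is unique within the partition (called the row key in Azure Table storage). It does not call any Azure API. The `TableSketch` class and its method names are hypothetical and exist only for this example.

```python
from collections import defaultdict


class TableSketch:
    """Hypothetical in-memory stand-in for a partitioned entity table."""

    def __init__(self):
        # partition key -> {row key -> entity properties}
        self._partitions = defaultdict(dict)

    def upsert(self, partition_key, row_key, properties):
        """Insert or overwrite the entity identified by the two keys."""
        self._partitions[partition_key][row_key] = dict(properties)

    def get(self, partition_key, row_key):
        """Point lookup by both keys; returns None if the entity is absent."""
        return self._partitions.get(partition_key, {}).get(row_key)

    def query_partition(self, partition_key):
        """Return all entities in one partition, ordered by row key."""
        rows = self._partitions.get(partition_key, {})
        return [rows[key] for key in sorted(rows)]
```

A point lookup such as `table.get("orders-2016", "0001")` touches only one partition, which is why the choice of partition key matters for how data and load are spread out.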
7. Data management
•Azure Search provides text search and a subset of OData's structured filters using REST or SDK
APIs.
•DocumentDB is a NoSQL database service that implements a subset of the SQL SELECT
statement on JSON documents.
•Redis Cache is a managed implementation of Redis.
•StorSimple manages storage tasks between on-premises devices and cloud storage.[10]
•SQL Database, formerly known as SQL Azure Database, is used to create, scale, and extend
applications into the cloud using Microsoft SQL Server technology. It also integrates with Active
Directory, Microsoft System Center, and Hadoop.
•SQL Data Warehouse is a data warehousing service designed to handle computationally intensive
and data-intensive queries on datasets exceeding 1 TB.
10. Design
Microsoft Azure uses a specialized operating system, also called Microsoft Azure, to
run its "fabric layer": a cluster hosted at Microsoft's data centers that manages
the computing and storage resources of the computers and provisions those resources
(or a subset of them) to applications running on top of Microsoft Azure.
Microsoft Azure has been described as a "cloud layer" on top of a number of
Windows Server systems, which use Windows Server 2008 and a customized
version of Hyper-V, known as the Microsoft Azure Hypervisor, to provide
virtualization of services.
12. Data Center Fabric
Facebook’s network infrastructure needs to constantly scale and evolve, rapidly
adapting to application needs. The amount of traffic from Facebook to the
Internet, called “machine to user” traffic, is large and ever increasing as more
people connect and new products and services are created.
13. Network technology
For most traffic, the fabric makes heavy use of equal-cost multi-path (ECMP)
routing with flow-based hashing. A Facebook data center carries a very large
number of diverse concurrent flows, so load distribution across all fabric links
is statistically almost ideal. To prevent occasional “elephant flows” from
taking over and degrading an end-to-end path, the network is multi-speed: all
switches are interconnected with 40G links, while servers connect on 10G ports
on the top-of-rack switches (TORs).
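
The flow-based hashing behind ECMP can be sketched as follows. This is a minimal illustration, not how any particular switch implements it: real devices use hardware hash functions, and SHA-256 is only a convenient stand-in here. The function name `pick_next_hop`, the 5-tuple layout, and the `spine-N` labels are assumptions made for the example.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Choose one of several equal-cost next hops for a flow.

    `flow` is assumed to be a (src_ip, dst_ip, protocol, src_port, dst_port)
    tuple. All packets of the same flow hash to the same next hop, so a flow
    is never reordered across paths, while different flows spread out.
    """
    if not next_hops:
        raise ValueError("need at least one next hop")
    # Build a stable byte string from the flow fields and hash it.
    key = "|".join(str(field) for field in flow).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Reduce the hash to an index into the list of equal-cost paths.
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]
```

Calling `pick_next_hop(("10.0.0.1", "10.0.1.9", "tcp", 40000, 443), ["spine-1", "spine-2", "spine-3", "spine-4"])` repeatedly returns the same spine for that flow. Because a single flow always stays on one path, one very large flow can still load a link unevenly, which is the “elephant flow” concern that the faster inter-switch links address.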
14. Gradual scalability
To achieve seamless growth, the whole network has been designed and planned as
an end-to-end non-oversubscribed environment.
This design allows the same forwarding capacity building-wide as was previously
available intra-cluster.
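
To make “non-oversubscribed” concrete, the short sketch below computes an oversubscription ratio as total downlink capacity divided by total uplink capacity; a ratio of 1.0 means the uplinks can carry everything the downlinks can offer. The function name and the port counts in the usage note are hypothetical and are not taken from the Facebook design.

```python
def oversubscription_ratio(downlink_gbps, uplink_gbps):
    """Return downlink capacity divided by uplink capacity.

    1.0 means non-oversubscribed; values above 1.0 mean the devices below
    can offer more traffic than the uplinks can carry.
    """
    if uplink_gbps <= 0:
        raise ValueError("uplink capacity must be positive")
    return downlink_gbps / uplink_gbps
```

For example, a hypothetical switch with 16 × 10G server ports and 4 × 40G uplinks gives `oversubscription_ratio(16 * 10, 4 * 40)`, which is 1.0. The same uplinks with 48 × 10G server ports would give 3.0.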
16. Automation
To automate the fabric, a “top down” approach is followed: holistic network
logic first, then individual devices and components second. This abstracts from
individual platform specifics and operates on large numbers of similar
components at once.
The tools are capable of dealing with different fabric topologies and form
factors, creating a modular solution that can adapt to data centers of different
sizes.
17. Transparent transition
To make the transition to fabric seamless and allow for backward compatibility,
the logical concept of the “cluster” is preserved, but it is now implemented as
a collection of pods.
From the networking point of view, a cluster has become just a virtual “named
area” on the fabric, and physically the pods that form a cluster can be located
anywhere on the data center floor.
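
The idea of a cluster as a purely logical “named area” can be sketched as a mapping from cluster names to sets of pods, with no reference to physical location. This is only an illustrative data structure; the `Fabric` class, its methods, and the pod and cluster names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Fabric:
    # cluster name -> set of pod identifiers (logical grouping only)
    clusters: dict = field(default_factory=dict)

    def assign(self, cluster, pod):
        """Place a pod in a cluster, removing it from any previous one."""
        for pods in self.clusters.values():
            pods.discard(pod)
        self.clusters.setdefault(cluster, set()).add(pod)

    def cluster_of(self, pod):
        """Return the name of the cluster a pod belongs to, or None."""
        for name, pods in self.clusters.items():
            if pod in pods:
                return name
        return None
```

Moving a pod between clusters in this sketch is only a change of membership, for example `fabric.assign("cache-b", "pod-41")`, which mirrors the point that pods forming a cluster need not sit next to each other on the floor.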
19. Hardware
The original hardware (circa 1998) that was used by Google when it was located
at Stanford University included:
•Sun Microsystems Ultra II with dual 200 MHz processors and 256 MB of RAM.
This was the main machine for the original Backrub system.
•2 × 300 MHz dual Pentium II servers donated by Intel, with 512 MB of
RAM and 10 × 9 GB hard drives between the two. The main search ran on
these.
•IBM RS/6000 F50 donated by IBM, with 4 processors, 512 MB of memory,
and 8 × 9 GB hard disk drives.
20. Network topology
Google has numerous data centers scattered around the world. The largest
known centers are located in The Dalles, Oregon; Atlanta, Georgia; Reston,
Virginia; Lenoir, North Carolina; and Moncks Corner, South Carolina. In Europe,
the largest known centers are in Eemshaven and Groningen in the Netherlands
and in Mons, Belgium. Google's Oceania data center is claimed to be located in
Sydney, Australia.
21. Software
•Google Web Server (GWS) – custom Linux-based Web server that Google uses for its online
services.
•Storage systems:
•Google File System and its successor, Colossus
•BigTable – structured storage built upon GFS/Colossus
•Spanner – planet-scale structured storage system, next generation of BigTable stack
•Google F1 – a distributed, quasi-SQL DBMS based on Spanner, replacing a custom version of
MySQL.
22. Software (continued)
•Chubby lock service
•MapReduce and Sawzall programming language
•Indexing/search systems:
•TeraGoogle – Google's large search index (launched in early 2006), designed by Anna Patterson
of Cuil fame.
•Caffeine (Percolator) – continuous indexing system (launched in 2010).
•Hummingbird – major search index update, including complex search and voice search.
•Borg – declarative process scheduling software