This document discusses integrating OpenStack into Sina's existing infrastructure. It describes Sina's business, infrastructure, and challenges faced in integrating OpenStack. Key integration challenges discussed include network deployment, security considerations, load balancing, and evaluating Swift for object storage. The document also outlines Sina's contributions to OpenStack around billing and monitoring integration.
2. Agenda
Background
● Who We Are
● Infrastructure & Platform
● Challenges
Integration Challenges
● Network Deployment
● Security Consideration
● Load Balancer
● Swift Evaluation
Our Contributions
● Billing
● Monitoring
3. Who Are We
Sina.com
• Largest infotainment web portal in China
• Provides various online services, such as news, finance,
video, email, and blog hosting
• Operates first PaaS cloud computing platform
Sina Weibo
• Twitter-like microblog service
• Over 300 million users
• Huge influence on Chinese society
We are building a reliable, scalable and secure
infrastructure and platform to support our business.
4. Infrastructure & Platform
Physical Servers
Traditional Operation
Virtualization Platform(IaaS)
● VM Management System (VMMS) → Sina Web Service (SWS)
● VMMS is a private solution developed in-house
● SWS is based on OpenStack
Application Platform(PaaS)
●Virtual Host → Sina App Engine(SAE)
●SAE provides both Public and Private Service.
5. Sina App Engine
• No. 1 Public PaaS Platform in
China launched in Nov 2009
• PHP, Python, Java and Ruby
Support
• Numbers
160,000+ developers
200,000+ apps on SAE
800 million page views per day
20+ Services
• SAE Cloud Storage Service is replaced by Swift
• Deploy SAE on OpenStack
6. Challenges
SAE meets the majority of business needs, but does not cover
all, especially for web games
Customers require a full cloud computing stack
We chose OpenStack as our IaaS solution
8. OpenStack Deployment
[Deployment diagram; components shown: RabbitMQ, MySQL, dashboard, scheduler, nova-api, nova-compute (multiple nodes), nova-network (multiple nodes), keystone, glance, Sina SSO, Swift]
9. Nova Network
Networking is the biggest challenge for IaaS
Network Topology:
• VLAN
• FlatDHCP
• FlatDHCP & Multihost
10. Network Topology --- VLAN
Capability:
• Accessibility of VMs within one tenant
• Isolation of VMs from different tenants
• VMs can access the public network
• VMs are accessible from the public network
• Isolation between virtual network and
internal network
Drawback:
• Pre-allocate network for future projects
• Traffic bottleneck in the NAT gateway
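The pre-allocation drawback can be illustrated with a small sketch. It assumes one subnet and one VLAN ID per project, carved out of a single fixed range ahead of time; the function name, the 10.0.0.0/16 range, the /24 size, and the starting VLAN ID are illustrative assumptions, not values from the deployment.

```python
import ipaddress


def preallocate_networks(fixed_range, num_networks, prefix_len, vlan_start):
    """Split fixed_range into per-project subnets, pairing each with a VLAN ID."""
    supernet = ipaddress.ip_network(fixed_range)
    subnets = supernet.subnets(new_prefix=prefix_len)
    plan = []
    # zip stops at whichever runs out first: requested count or available subnets.
    for index, subnet in zip(range(num_networks), subnets):
        plan.append({"vlan": vlan_start + index, "cidr": str(subnet)})
    if len(plan) < num_networks:
        raise ValueError("fixed range is too small for the requested networks")
    return plan


# Hypothetical values: four project networks reserved before any project exists.
plan = preallocate_networks("10.0.0.0/16", 4, 24, 100)
for entry in plan:
    print(entry)
```

Every entry in the plan is reserved whether or not a project ever uses it, which is the cost the slide points at.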
11. Network Topology(Flat)
Capability:
• Accessibility of all VMs in the fixed IP range
• VM is able to access public network
• VM is accessible from public network
• Full isolation between virtual network and
internal network
Drawback:
• Tenant isolation is weaker
• Traffic bottleneck in the NAT gateway
12. Network Topology(Flat & Multihost)
Capability:
• Accessibility of all VMs in the fixed IP range
• VM is able to access public network
• VM is accessible from public network
Bonus:
• Fully distributed architecture avoids a single point of failure
• Multiple gateways eliminate the NAT bottleneck
• High throughput between OS regions
Drawback:
• Tenant isolation is weaker
• Needs a security facility (SWS filter) to protect the intranet
If security problems were solved, this would be our best choice!
13. Security in OpenStack
Security Group --- Layer 3 Filter
• Role-based firewall
• One security group is a role
• Ingress filtering
• Target is the instance
• Source can be CIDR or another group
• Implemented by iptables
• See details: iptables -t filter -n -L
• Whitelist mechanism (ACCEPT rules)
Static filters --- Layer 2 Filter
• MAC, IP, and ARP spoofing protection
• Not configurable
• Defined in /etc/libvirt/nwfilter/*.xml
• Implemented by ebtables
• See details: ebtables -t nat --list
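The security group behaviour listed above (ingress only, whitelist, source given as a CIDR or as another group) can be sketched as a small decision function. This is an illustration of the matching logic only, not the nova or iptables implementation; the Rule class, the group names, and the addresses are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One ACCEPT rule: the source is either a CIDR or another security group."""
    protocol: str
    port_min: int
    port_max: int
    cidr: Optional[str] = None
    source_group: Optional[str] = None


def ingress_allowed(rules, group_members, src_ip, protocol, port):
    """Whitelist check: a packet passes only if some rule explicitly accepts it."""
    src = ipaddress.ip_address(src_ip)
    for rule in rules:
        if rule.protocol != protocol:
            continue
        if not (rule.port_min <= port <= rule.port_max):
            continue
        if rule.cidr is not None and src in ipaddress.ip_network(rule.cidr):
            return True
        members = group_members.get(rule.source_group, set())
        if rule.source_group is not None and src_ip in members:
            return True
    return False  # nothing matched, so the default is to drop


# Hypothetical "db" role: HTTP from anywhere, MySQL only from the "app" group.
db_rules = [
    Rule("tcp", 80, 80, cidr="0.0.0.0/0"),
    Rule("tcp", 3306, 3306, source_group="app"),
]
members = {"app": {"10.0.1.5"}}

print(ingress_allowed(db_rules, members, "203.0.113.9", "tcp", 80))
print(ingress_allowed(db_rules, members, "203.0.113.9", "tcp", 3306))
print(ingress_allowed(db_rules, members, "10.0.1.5", "tcp", 3306))
```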
14. Security Enhancement
SWS Filter
Prevent Intranet Penetration
• Intranet is the internal network outside of
OpenStack
Egress filtering
• Target is internal network
• Source is instances in OpenStack
Implementation
• Whitelist mechanism(ACCEPT rules)
• On top of the nova-filter-top FORWARD chain
Rationale
• SWS filter is managed by cloud manager
• Only explicitly authorized packets can reach the internal network
• Packets should be controlled within the compute node
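As a rough sketch of the egress whitelist described above, the following builds (but does not execute) a list of iptables commands. The chain name sws-filter, the intranet CIDR, the whitelist entries, and the choice to insert the jump at position 1 of FORWARD are all assumptions made for illustration; the slides do not show the actual rules.

```python
def build_sws_filter_rules(intranet_cidr, whitelist, chain="sws-filter"):
    """Return iptables argument lists for an egress whitelist toward the intranet.

    whitelist is an iterable of (destination, protocol, port) tuples.
    Nothing is executed here; the caller decides how to apply the commands.
    """
    commands = [
        # Create the dedicated chain (hypothetical name).
        ["iptables", "-N", chain],
        # Send intranet-bound forwarded traffic to it before any other rule.
        ["iptables", "-I", "FORWARD", "1", "-d", intranet_cidr, "-j", chain],
    ]
    for dest, proto, port in whitelist:
        commands.append([
            "iptables", "-A", chain,
            "-d", dest, "-p", proto, "--dport", str(port),
            "-j", "ACCEPT",
        ])
    # Whitelist semantics: anything not explicitly accepted is dropped.
    commands.append(["iptables", "-A", chain, "-j", "DROP"])
    return commands


# Hypothetical intranet range and one authorized internal service.
rules = build_sws_filter_rules("10.0.0.0/8", [("10.1.2.3", "tcp", 3306)])
for command in rules:
    print(" ".join(command))
```

Keeping the rules in one chain managed by the cloud operator matches the rationale on the slide: tenants cannot edit it, and the filtering happens on the compute node itself.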
16. Load Balancer
Goals
Load Balance
• Dispatch requests
• Support multiple routing algorithms
• Health check
Smart DNS
Acceleration
• Reality: narrow bandwidth between ISPs
• Building fiber channels from ISPs to the pivot
• Serve each user from an endpoint within the user's ISP
IPv4 Shortage
• Reality: dozens of public IPs support hundreds of VMs
• IPv4 has been exhausted
• IPv6 is not realistic yet in China
[Diagram: DNS acceleration design. ISPs (Telecom, Unicom, Mobile, others) on the public network connect to the pivot over high-speed fiber channels.]
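The Smart DNS idea, answering each user with an endpoint inside the user's own ISP, could be sketched as follows. The ISP address ranges and endpoint addresses are placeholder documentation ranges, and the lookup tables and resolve function are hypothetical; a real deployment would use actual ISP prefix data.

```python
import ipaddress

# Placeholder prefixes per ISP (RFC 5737 documentation ranges, not real data).
ISP_RANGES = {
    "telecom": ["198.51.100.0/24"],
    "unicom": ["203.0.113.0/24"],
}

# Hypothetical load balancer endpoints reachable inside each ISP.
ENDPOINTS = {
    "telecom": "192.0.2.10",
    "unicom": "192.0.2.20",
    "default": "192.0.2.30",
}


def resolve(client_ip):
    """Return the endpoint in the client's ISP, or the default endpoint."""
    addr = ipaddress.ip_address(client_ip)
    for isp, cidrs in ISP_RANGES.items():
        if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs):
            return ENDPOINTS[isp]
    return ENDPOINTS["default"]


print(resolve("198.51.100.7"))
print(resolve("203.0.113.200"))
print(resolve("8.8.8.8"))
```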
18. Load Balancer
Layer 4 Load Balancer
Consideration:
1. dispatch request by TCP port
2. lvs + haproxy
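Dispatching by TCP port can be pictured with a toy dispatcher that keeps one backend pool per port, rotates through it round-robin, and skips backends that failed a health check. This is a plain Python illustration of the idea, not how lvs or haproxy are configured; the class, method names, and addresses are hypothetical.

```python
import itertools


class PortDispatcher:
    """Round-robin dispatch per TCP port, skipping unhealthy backends."""

    def __init__(self, pools):
        # pools maps a listening port to its list of backend addresses.
        self._pools = pools
        self._cursors = {
            port: itertools.cycle(backends) for port, backends in pools.items()
        }
        self._healthy = {b for backends in pools.values() for b in backends}

    def mark(self, backend, healthy):
        """Record the result of a health check for one backend."""
        if healthy:
            self._healthy.add(backend)
        else:
            self._healthy.discard(backend)

    def pick(self, port):
        """Return the next healthy backend for the given port."""
        backends = self._pools.get(port)
        if not backends:
            raise LookupError(f"no pool configured for port {port}")
        cursor = self._cursors[port]
        # Try each backend at most once per call.
        for _ in range(len(backends)):
            candidate = next(cursor)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError(f"no healthy backend for port {port}")


lb = PortDispatcher({
    80: ["10.0.0.1:80", "10.0.0.2:80"],
    3306: ["10.0.0.3:3306"],
})
print([lb.pick(80) for _ in range(3)])
lb.mark("10.0.0.1:80", healthy=False)
print([lb.pick(80) for _ in range(2)])
```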
19. Swift Evaluation
Extremely Durable and Highly Available
Superior Scalability
Linear Growth of Performance
Symmetric Architecture
No single point of failure
Simple & Reliable
20. Swift Evaluation
• 1 Zone = 1 Physical Server with 12 x 2 TB disks
• Write/Read applies quorum protocol
[Diagram: client requests (GET abc.png, PUT abc.png) pass through a Load Balancer to five zones, Zone1 to Zone5. Each zone runs a Proxy Server, Object Server, Container Server, and Account Server.]
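The quorum protocol mentioned above can be illustrated with a simple majority rule. The sketch assumes a majority quorum (more than half of the replicas must acknowledge) and treats any 2xx status as success; the function names are hypothetical and the exact quorum formula used by a given Swift release may differ.

```python
def quorum_size(replicas):
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def write_succeeds(statuses, replicas=3):
    """A write is acknowledged once a quorum of storage nodes returned 2xx."""
    acknowledged = sum(1 for status in statuses if 200 <= status < 300)
    return acknowledged >= quorum_size(replicas)


# With 3 replicas, 2 successful responses are enough.
print(quorum_size(3))
print(write_succeeds([201, 201, 503]))
print(write_succeeds([201, 503, 503]))
```

With five single-server zones and three replicas, this means a request can still complete while one of the three target zones is unavailable.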
21. Swift Evaluation
Swift packages
• Proxy Server
• Account Server
• Container Server
• Object Server
Physical Deployment
Storage Nodes
[Diagram: storage node disk layout. Labels shown: OS installation, raid 1, devices sda, sdb, sdc, sdd, …… sdk, and disks disk1, disk2, disk3, disk4, disk5, …… disk12.]
22. Swift Evaluation
Performance issue
CPU utilization rate up to 100% even without request
Testing environment:
• Nodes: 5 x Dell R510
• CPU: Intel® Xeon® E5360
• Memory: 12GB
• Replica: 3
• No. of Objects: 150,000,000
• No. of Accounts: 120,000
• No. of Containers: 160,000
Audit:
• swift-account-auditor: 1.5 min
• swift-account-replicator: 9.5 min
• swift-container-auditor: 8.4 min
• swift-container-replicator: 9.3 min
• swift-container-updater: 19.0 min
• swift-object-updater: 0.1 s
• swift-object-replicator: 10.5 hours
• swift-object-auditor: 48.3 hours
Result:
Periodic scanning of all partitions, checksum calculation, and synchronization
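To put the object auditor figure in perspective, here is a back-of-the-envelope calculation. It assumes that every object is stored as 3 replicas spread evenly over the 5 nodes and that the 48.3 hours covers one full audit pass over all replicas; the slides do not state how the time was measured, so the resulting rate is only indicative.

```python
OBJECTS = 150_000_000
REPLICAS = 3
NODES = 5
OBJECT_AUDIT_HOURS = 48.3

# Total object replicas on disk and the even share held by each node.
replicas_total = OBJECTS * REPLICAS
replicas_per_node = replicas_total // NODES

# Aggregate audit rate if one pass over all replicas takes 48.3 hours.
audit_seconds = OBJECT_AUDIT_HOURS * 3600
cluster_rate = replicas_total / audit_seconds

print(f"replicas per node: {replicas_per_node:,}")
print(f"aggregate audit rate: {cluster_rate:,.0f} replicas/s")
```

Under these assumptions each node holds tens of millions of object files that are re-read and checksummed on every pass, which is consistent with the background CPU load reported on the slide.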