This document provides a summary of a presentation on implementing a virtual data center architecture in the cloud. It discusses using a hub-spoke model with a central "bastion" virtual network acting as a hub, connected to various line-of-business (LOB) virtual networks acting as spokes. The bastion VNET provides shared services and security functions for the LOB VNETs. It describes implementing firewalls, network virtual appliances, and other security controls in the bastion VNET to inspect and filter traffic between LOB VNETs and between the LOB VNETs and on-premises networks.
Virtual Data Center VDC - Azure Cloud Reference Architecture CRA
1. Presented by:
Ammar Hasayen | MS MVP
CISSP | Cybersecurity
http://ahasayen.com
CLOUD REFERENCE
ARCHITECTURE
PART 4 – Virtual Data Center
Date: 9th May 2020
Available on SlideShare &
YouTube
@ammarhasayen
2. About Me: http://ahasayen.com
Blog: http://blog.ahasayen.com
Social Media: @ammarhasayen
Microsoft MVP | Pluralsight Author | Blogger
Book Author
AMMAR HASAYEN
CISSP | CISM | AWS Architect | Azure Security Engineer | M365 Security Engineer
3. CHECK THE VIDEO DESCRIPTION BELOW FOR
LINKS TO PREVIOUS TALK
Quick Overview of Previous Talk
4. Cloud reference architecture (CRA) helps organizations
address the need for detailed, modular and current architecture
guidance for building solutions in the cloud
CRA
11. Shared Services Model
Shared
Services
LOB 1
RBAC
LOB 2
RBAC
LOB 3
RBAC
Hybrid Connectivity
RBAC
RBAC allows segregation of duties
between centralized and
specialized team
Common components are
minimized (reduced cost and
complexity)
DevOps are enabled where
possible (workload subscriptions)
Centralized IT is enabled at the
security and infrastructure
components
Central security/infra teams
manage the edges (internet and to
on-premises)
13. Extend Trust Cloud
On-premises
Data Center
Extend The Trust
SDN
VNETS Subnets UDRs
NSG NVAs DMZ
Shared Services
VNET Service Endpoints
DDoS
Virtual Data Center
15. What Is a VNET (Virtual Network)?
Isolated boundary with no default
ingress endpoints
Segment with subnet and security
groups
Deploy Workloads
Control traffic flow with User
Defined Routes (UDR)
Connectivity
- Expose public endpoint
- S2S VPN or P2S
- ExpressRoute
Bring Your Own Network
VNET
Subnet
10.1.0.0/16
10.1.1.0/24
Virtual Data Center
NSG
NSG
NSG
UDR
UDR
UDR
16. Why You Might Need More Than One VNET?
Virtual Data Center
Subscription HR App
Subscription Marketing App
Subscription IT Services
VNETs can’t span multiple
subscriptions
VNET is a separate management
control
Gateway transit for virtual network
peering
VNET
HR App
Subscription
VNET
Marketing App
Subscription
VNET
IT Services
Subscription
VNET
Peering
17. On-premises Connectivity
Virtual Data Center
Separate hybrid connectivity for
each VNET and LOB
Increases management complexity
Separate firewall in each LOB
VNET
Does not scale well with more LOB
in the cloud
Overcome subscription limit to the
number of connections
VNET
Gateway
VNET
Gateway
VNET
Gateway
US Office Europe Office
VNET
Peering
VNET
Peering
VNET Peering
19. Cloud Security Alliance
It is a non-profit organization with a mission to “promote the use of best
practices for providing security assurance within cloud computing.”
The Cloud Security Alliance recommends implementing a preferred and flexible
architecture for hybrid cloud connectivity using a “bastion virtual network”.
In this architecture, it is possible to connect multiple cloud networks to your
on-premises datacenter via one hybrid cloud connection.
You build a dedicated virtual network for the hybrid connection and then
peer any other networks through the designated bastion network.
You can also deploy firewall rule sets to protect traffic flowing in and out of
the hybrid connection.
20. The Bastion Virtual Network
On-Premises
Virtual Data Center
LOB 3 VNET LOB 2 VNET LOB 1 VNET
VNET
Peering
Bastion VNET
Hybrid Connectivity
LOB 4 VNET
Firewall & Inspection
Shared Services
21. Connectivity
to On-Premises
Virtual Data Center
LOB 3 VNET LOB 2 VNET LOB 1 VNET
VNET
Peering
Traffic originating from LOB VNETs
goes to the central firewall for
inspection
Then through the centralized
hybrid connectivity on the bastion
VNET
Traffic originating from on-
premises to LOB VNET goes to
the bastion VNET
Traffic is inspected by the central
firewall in the bastion VNET
On-Premises
Bastion VNET
Hybrid Connectivity
22. LOB-to-LOB
Connectivity
Virtual Data Center
LOB 3 VNET
LOB 2 VNET
LOB 1 VNET
Mesh connectivity is one option
Doesn’t scale well and can reach
subscription limits
Separate firewall in each LOB
VNET
LOB 4 VNET
VNET
Peering
23. LOB-to-LOB
Connectivity
Virtual Data Center
LOB 3 VNET LOB 2 VNET LOB 1 VNET
Hub and Spoke model
Each spoke maintains one
connection to the hub VNET
Central firewall at the hub inspects
spoke-to-spoke traffic
Scales well with more LOB in the
cloud
LOB 4 VNET
VNET
Peering
Hub VNET
24. LOB-to-LOB
Connectivity
Virtual Data Center
LOB 3 VNET LOB 2 VNET LOB 1 VNET
Traffic originating from one LOB
VNET goes to the hub VNET
Traffic gets inspected by the
central firewall at the hub VNET
Traffic is then routed through
VNET peering to the destination
LOB VNET
LOB 4 VNET
VNET
Peering
Hub VNET
25. Shared Services Virtual Data Center
LOB 1 VNET
Duplicate common services
between LOB applications
Increases management complexity
Increases cost
Bastion VNET
DC CA Files
LOB 2 VNET
DC CA Files
26. Shared Services Virtual Data Center
LOB 1 VNET
Workloads in one LOB VNET use
the VNET peering connection to
access domain controller in the
hub VNET
Workloads in other LOB VNET use
the peering connection to access
file repository in the Hub VNET
No need to implement common
services in LOB VNETs
New LOB deployed can use all
common services already
available in the hub VNET
VNET
Peering
Hub VNET
DC CA Files
LOB 2 VNET
LOB 3 VNET
27. Jumpboxes Virtual Data Center
LOB 1 VNET
Hub VNET is a good candidate to
place jumpboxes
Remote administration for
workloads in LOB VNETs is only
allowed from the jumpboxes in the
hub VNET
Network Security Groups (NSGs)
allow incoming remote
administration traffic from hub
jumpboxes
Central firewall is managed
through the jumpbox
Possibly implement jumpboxes in
each VNET
Hub VNET
Jumpbox
LOB 2 VNET
NSG NSG
28. Hub-Spoke Model Virtual Data Center
LOB N VNET LOB 2 VNET LOB 1 VNET
VNET
Peering
Spokes are LOB VNETs where
applications are deployed
Hub is the bastion VNET where
spokes connect to
Hub serves as:
- Shared hybrid connectivity
- Shared services
- Policy enforcement, monitoring
and traffic inspection
Hub VNET is also known as:
- Bastion VNET
- Shared services VNET
On-Premises
Hub VNET
Hybrid Connectivity
Shared Services
HUB VNET
Shared Services
Spoke VNET
29. Loosely Coupled Virtual Data Center
LOB 1 VNET
VNET
Peering
Adding a new LOB app does not
affect other LOB apps
Removing LOB app does not
affect other LOB apps
DevOps team working on LOB 1
app does not depend on other
DevOps teams serving other LOB
apps
On-Premises
Hub VNET
Hybrid Connectivity
LOB 2 VNET LOB 3 VNET
Shared Services
30. Segregation of
Duties
Virtual Data Center
LOB 2 VNET LOB 1 VNET
Separation of concerns &
segregation of duties between:
- DevOps
- SecOps and NetOps
DevOps can’t shut down the
firewall to make their life easier
- putting the business at risk
Shared Services
Shared Services
DevOps
SecOps NetOps
31. Auditing & Logs Virtual Data Center
LOB 2 VNET LOB 1 VNET
Dedicated subscription to host
monitoring, audit and log data
- Log analytics workspaces
- Storage accounts
- Playbooks
Separate logs from workloads
- Different ACLs on log data
- Separation of duties
Removing a LOB VNET/subscription
doesn’t affect the log retention
policy.
Shared Services VNET
VNET
Auditing & Logging Subscription
Storage Accounts
Log Analytics Workspaces
Playbooks
Send Log
SecOps
32. Hub VNET Deep Dive
The need for “Financial Governance”
33. Hub-Spoke Model Virtual Data Center
LOB N VNET LOB 2 VNET LOB 1 VNET
VNET
Peering
Spokes are LOB VNETs where
applications are deployed
Hub is the bastion VNET where
spokes connect to
Hub serves as:
- Shared hybrid connectivity
- Shared services
- Policy enforcement, monitoring
and traffic inspection
Hub VNET is also known as:
- Bastion VNET
- Shared services VNET
On-Premises
Hub VNET
Hybrid Connectivity
Shared Services
HUB VNET
Shared Services
Spoke VNET
34. On-Premises
ExpressRoute Gateway
Hub VNET
Gateway Subnet
S2S VPN Gateway
ExpressRoute
Failover VPN
P2S VPN
Jumpboxes
Availability Set
Bastion Subnet
Domain Controllers
Availability Set
DC Subnet
Private DMZ In
NSG
NVA
NIC
NIC
Private DMZ Out
NSG
NVA
NIC
NIC
To Spokes
From Spokes
NSG NSG
36. On-Premises
Hub VNET
Gateway Subnet
Private DMZ In
NSG
NVA
NIC
NIC
Private DMZ Out
NSG
NVA
NIC
NIC
To Spokes
From On-premises To LOB
37. On-Premises
Hub VNET
Gateway Subnet
Private DMZ In
NSG
NVA
NIC
NIC
Private DMZ Out
NSG
NVA
NIC
NIC
From LOB To On-premises
From Spokes
38. On-Premises
ExpressRoute Gateway
Hub VNET
Gateway Subnet
S2S VPN Gateway
ExpressRoute
Failover VPN
P2S VPN
Private DMZ In
NSG
NVA
NIC
NIC
Private DMZ Out
NSG
NVA
NIC
NIC
To Spokes
From Spokes
The Internet
Public DMZ In
NSG
NVA
NIC
NIC
Public DMZ Out
NSG
NVA
NIC
NIC
To Spokes
From Spokes
PIP
Internet Traffic
40. US Virtual Data Center
Hub VNET
Spoke 1
Spoke 2
Spoke n
Europe Virtual Data Center
Hub VNET
Spoke 1
Spoke 2
Spoke n
US Office Europe Office
Multiple Regions
41. PLEASE SHARE YOUR FEEDBACK ON ONE OF MY SOCIAL CHANNELS
@ammarhasayen
Let Me Know Your Feedback
42. YOU CAN ACCESS THE SLIDES FROM SlideShare @ammarhasayen
Thank You For Your Time
43. YOU CAN WATCH THIS PRESENTATION ON YOUTUBE
http://YouTube.com/ammarhasayen
Watch It Online
48. COPYRIGHT STATEMENT
I want to help you share knowledge and creativity, to build a more
equitable, accessible, and innovative world, by unlocking the
potential of the internet to drive a new era of development, growth
and productivity.
This is why I provide you with my copyright license, to make it easy
for you to share and use creative work on simple terms and
conditions. This license lets you remix, tweak, and build upon my
work non-commercially, as long as you credit me and license your
new creations under the identical terms.
https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
Attribution-NonCommercial-ShareAlike
Simply put, the cloud reference architecture (CRA) helps organizations address the need for detailed, modular and current architecture guidance for building solutions in the cloud.
The ISO/IEC 17789 Cloud Computing Reference Architecture defines four different views for the CRA:
User View
Functional View
Implementation View
Deployment View.
We will be focusing on the Deployment View of the Cloud Reference Architecture for now.
To accomplish this, we need to define the components of the cloud reference architecture that we will use to build a secure, compliant and flexible framework on top of which developers can build applications with agility and speed of delivery in mind.
At the core of building an enterprise scaffold for cloud migration is the Enterprise Structure Layer, which acts as the foundation on which all other layers are built. Here you define a hierarchy that maps to your organization’s departments and cost centers to govern spending and get visibility of cost across departments, line of business applications or business units. On top, you define a Management Hierarchy that gives you even more flexibility when assigning permissions and applying policies to enforce your governance in the cloud.
With that carefully defined, you start adopting key best practices and patterns that map to your organization’s maturity level. You can think of these as the Deployment Essentials, which include establishing a proper naming convention, deploying with automation and using Infrastructure as Code instead of deploying resources through the web interface, which can cause a snowball effect of changes that later become hard to manage, track or even audit. The idea here is to have a consistent way of deploying resources over and over again. Not only does it give you the speed of delivery we all want, but also the peace of mind that what you verified as a compliant environment in code is the blueprint used to deploy resources across your subscriptions.
Now it is time to start building the foundation infrastructure, and this is the Core Networking layer. At this layer, governance can be achieved using different technologies that help you isolate workloads and deploy security controls to monitor and inspect traffic across your cloud infrastructure. One of the best recommendations here is to use a hub and spoke topology and adopt the shared services model, where common resources are consumed by different LOB applications. This has many benefits that we will discuss in great detail later.
In this layer, you decide how to extend your on-premises data center to the cloud. You also define how to design and implement isolation using virtual networks and user-defined routes. This is also the time when you deploy Network Virtual Appliances (NVAs) and firewalls to inspect data flow inside your cloud infrastructure.
Another key feature of the cloud is Software Defined Networking (SDN), which gives you the opportunity to do micro-segmentation by implementing Network Security Groups and Application Security Groups to better control traffic even within subnets, not only at the edge of the network. This is an evolution of how we think about isolation and protection in such an elastic cloud computing environment.
After you are done with the core networking layer, and just before deploying your resources, you should consider how you are going to enforce Resource Governance. This is important because the goal of the cloud reference architecture is to give developers more control and freedom to deploy workloads quickly and meet their deadlines, while adhering to corporate security and governance needs. One way to achieve this balance is by applying resource tags, implementing cost management controls, and translating your organizational governance rules and policies into Azure policies that govern the usage of cloud resources.
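As a rough illustration of the tagging idea above, the following Python sketch audits a resource inventory against a required tag set, in the spirit of what a policy engine does. The tag names and the inventory records are hypothetical, and the code does not call Azure Policy or any Azure API.

```python
# Hypothetical governance rule: every resource must carry these tags.
REQUIRED_TAGS = {"costCenter", "environment", "owner"}


def missing_tags(resource):
    """Return the sorted list of required tags that a resource lacks."""
    return sorted(REQUIRED_TAGS - set(resource.get("tags", {})))


def audit(resources):
    """Map each non-compliant resource name to its missing tags."""
    report = {}
    for resource in resources:
        missing = missing_tags(resource)
        if missing:
            report[resource["name"]] = missing
    return report


# Made-up inventory, for illustration only.
inventory = [
    {"name": "vm-hr-01",
     "tags": {"costCenter": "HR", "environment": "prod", "owner": "hr-devops"}},
    {"name": "vm-mkt-01", "tags": {"environment": "dev"}},
]

print(audit(inventory))
```

In a real deployment the same rule would be expressed as a policy assigned at the management hierarchy level, so that non-compliant resources are flagged or denied at creation time rather than found afterwards.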
Once all this foundation work is finished, you can start planning how to deploy your line of business applications (LOB applications). Most likely you need to define different application lifecycle environments such as Production, Dev, and QA.
Here you can also establish a shared services workspace to host shared infrastructure resources for your line of business applications to consume. If one of your business applications requires connectivity to on-premises resources, it can use, for example, the VPN gateway deployed in the shared services workspace instead of implementing a gateway for each application’s workspace. The shared services workspace is a key element when defining your CRA, as it hosts shared services like domain controllers, DNS services, jumpbox devices and security controls like firewalls.
But your job is far from finished, as security is a never-ending process, and this is where the Security Layer comes into the picture. Here you define a proper identity and access management model using Azure RBAC. Security practices like patching, encryption and secure DevOps are key areas in this layer. Furthermore, to gain the visibility and control you need in such a rapidly changing environment, you need to think of a security-as-a-service model that integrates natively with the cloud platform and services. Here you can use Azure Security Center to assess your environment for vulnerabilities, and also as an enabler of your incident response in the cloud, as you need to detect and remediate security incidents.
You can also implement Just-in-Time Virtual Machine Access to lock down management ports on your virtual machines. If you are in a highly regulated environment, you can also look at VNET Service Endpoints to protect access to PaaS services like Azure Storage, so that accessing these services does not pass through the public internet.
With all this in mind, you need to consider Business Continuity, high availability and backup, and here I want to remind you of the shared responsibility model of the cloud. You are responsible for many things, which might include planning how to do backups and how to design for high availability and even for disaster recovery.
And finally, you need to think about monitoring and auditing in the cloud. Is there a performance bottleneck that you should address right away? Do you require that changes to your cloud environment are audited and, if so, where are you going to keep the logs? Are you going to integrate them with your on-premises SIEM solution or use a cloud logging mechanism and, if so, does that solution retain the logs for the duration you need?
A virtual data center is not purely a networking concept, and it is not a product you buy in Azure; it is a philosophy.
Over the years I have seen many customers deploy workloads in Azure without forward thinking and without adopting a framework (a.k.a. a reference architecture). Usually they are at the initial stage of adopting the cloud, or in more of a “let’s first see how this thing works” mode. In my experience, and from talking to a lot of customers, the following pattern is the one I see most frequently.
“Let’s try to do this in the cloud”, someone shouted, thinking that it is time to evolve and start exploring the cloud to see how it feels. They might have some Azure credit in their EA agreement, or they consult with their CSP partner to get a subscription in Azure and are willing to spend some dollars on this. Perhaps they have a couple of people certified in Azure who want to get their hands dirty working in Azure.
Usually these attempts happen without proper sizing or cost optimization in mind and the goal is to put some workloads there. Sometimes it is a new small noncritical project and they want to give the cloud a chance. “We created a subscription in Azure” they shout. Of course the networking and security guys get excited as they want to be part of this new world of cloud computing or perhaps, they want to preserve their jobs, who knows.
Everyone is a global admin from the first day, of course, or else things will break (they thought). The security team puts on the cloud hat and starts deploying a virtual machine with their favorite firewall engine (Palo Alto, I assume), and of course we need two of them for high availability. Well done!
The networking team spends the whole weekend trying to open a site-to-site VPN between that subscription (VNET) and their on-premises network, and when that happens, they celebrate big time. Can’t blame them, as it is a big thing when you try it for the first time. We all celebrate after doing that: the happiness of being able to PING a machine in Azure from your laptop at the office!
All eyes are now on the infrastructure team. Those guys, who spent a lifetime deploying big virtualization clusters on-premises, are now tasked with creating a couple of Azure virtual machines to host a couple of applications. Without a full understanding of the different Azure virtual machine types or any of the cost optimization best practices (auto-shutdown, Azure Hybrid Benefit, or reserved instances), they just pick a VM SKU that looks right at that moment, and mission accomplished.
Now it is time for the application team to deploy their applications on those newly created virtual machines, assuming that the disks are optimized for the right IOPS and throughput, while the billing clock is counting. After all, they get their application running in Microsoft Azure and it is a victory day for everyone.
Weeks later, the head of IT is walking around the floor and he gets a call. “We have a new application that we need to deploy next week”, someone says. With full confidence he says, “let’s create a new Azure subscription and repeat the success story”. So a new subscription is born, and everyone gets busy deploying firewalls, VPN connectivity, and a couple of VMs. Oh wait, everyone is a subscription admin or else things will break, or worse, everyone is a global admin.
You can see where this is going, right? They might acquire a new company, or a new application needs to be deployed, and of course there is a new subscription and the whole story gets repeated. Perhaps this time the security team was not involved, and they end up deploying a subscription with production workloads without any security element (firewall).
With time, managing all those subscriptions, maintaining all those firewalls, monitoring the VPN tunnel health, and tracking changes becomes a challenge. What started as an exploration of the Azure land becomes islands of workloads deployed here and there. Governance is lost and security is a nightmare.
But what if there is a better way to do things? What if we paused for a minute and adopted a reference architecture that can work for today’s and tomorrow’s needs? A reference architecture that helps that company deploy different workloads in Azure or any cloud with governance and security in mind from day one. An architecture that works for each organization regardless of its size or needs.
I hope you are still reading and excited about what’s coming next. What if we introduced a new subscription that acts as a security and governance guardian? A subscription that helps solve all those problems by introducing a new abstraction layer in the middle. In this new subscription (later to be called the Shared Services subscription), we are going to remove all those firewalls from all other subscriptions and deploy only one set of firewall devices. Moreover, instead of having a VPN from each subscription to your on-premises infrastructure, we will just initiate one VPN tunnel from that shared services subscription to on-premises. Now, from each of the other subscriptions, we will create a VNET peering in Azure only to the shared services subscription. A new hub and spoke model starts to evolve to organize everything in the Azure land.
This is what we will call from now on the Shared Services Model. We deploy our security, management and connectivity infrastructure in one hub subscription, and offer these components as a service to our line of business (LOB) applications running in their respective subscriptions. If we think about it, this model has a lot of advantages over the previous model.
First, we established what is called in the security world a segregation of duties. This means network and security teams manage the shared services subscription and are the only people allowed to modify firewall settings or alter hybrid connectivity components, while application developers or DevOps get full access to their respective LOB subscriptions. A DevOps team can have subscription owner rights on its own subscriptions to innovate and do crazy stuff in its own space, but it can’t alter (and compromise) the firewall and networking components, as those are hosted in a separate subscription (the shared services subscription).
Second, we now eliminate the need to have connectivity and security components (firewalls) in each of the LOB subscriptions, which minimizes the cost and the management overhead. Instead, we have a well-defined security policy that is maintained at the shared services level.
DevOps are now more empowered to do whatever they want (ideally, Azure security policies need to be defined by the security team to prevent developers from accidentally creating public IPs in their own subscription and accidentally exposing workloads directly to the internet), and the security and networking teams need only manage things at the hub level.
The idea I am trying to deliver here is that, with a good reference architecture in place, IT and security teams can extend their trust to the Azure land with a proven and well-adopted hub and spoke architecture, while carefully considering the unique nature of the cloud. Take software defined networking: in the cloud, networking is defined by code, not by wires. Identity and access management, if designed right, enables proper segregation of duties between different stakeholders. Security becomes a shared responsibility in the cloud, where the nature of the risks is different but the role of IT security is the same, as we start to leverage more security as a service and threat intelligence, and learn how each Azure component should be securely configured. Compliance and monitoring are hugely impacted in the cloud, and without a proper mindset things might fall apart.
One of the most underestimated elements when considering the cloud, and the subject of one of the most common arguments I hear from customers, is the SDN component of any cloud platform, which is widely misunderstood. In the cloud, everything is done in code. Even if you are clicking through a fancy web management portal, web calls are being made to serve every request. A full networking stack can be defined in a JSON template and unleashed to create a full data center network. RIP broadcast and multicast, as they don’t exist in the cloud. No more VLANs and switches. Routing tables are not defined inside firewalls and switches (to some extent) but are attached to VNETs and subnets. L4 firewalls are now coded within a subnet boundary or within a group of machines with the same tag (micro-segmentation, or Application Security Groups in Azure).
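To make “a networking stack defined in a JSON template” a little more concrete, here is a minimal Python sketch that builds an ARM-style template containing one virtual network and prints it as JSON. The VNET name, subnet names and address ranges are hypothetical, and the `apiVersion` value is an assumption that should be checked against the current Azure documentation before any real use.

```python
import json


def vnet_template(name, address_space, subnets):
    """Build an ARM-style deployment template (as a dict) with one VNET."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [
            {
                "type": "Microsoft.Network/virtualNetworks",
                "apiVersion": "2020-05-01",  # assumed version, verify before use
                "name": name,
                "location": "[resourceGroup().location]",
                "properties": {
                    "addressSpace": {"addressPrefixes": [address_space]},
                    "subnets": [
                        {"name": subnet_name, "properties": {"addressPrefix": prefix}}
                        for subnet_name, prefix in subnets.items()
                    ],
                },
            }
        ],
    }


# Hypothetical hub network with a gateway subnet and one DMZ subnet.
template = vnet_template(
    "hub-vnet",
    "10.0.0.0/16",
    {"GatewaySubnet": "10.0.0.0/27", "dmz-in": "10.0.1.0/24"},
)

print(json.dumps(template, indent=2))
```

The point of the sketch is only that the whole network is data: it can be reviewed, versioned and redeployed like any other code.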
Even the way you interact with cloud services is now different, as most of the time these services are exposed over the internet (think of a piece of code trying to access a storage account, and don’t be surprised if you find out this is happening over the internet and not inside your lovely Azure network). Of course some technologies exist to address this (like VNET service endpoints), but they require special configuration.
I am not going to teach you here how Azure works, but I want to give you an example of how the whole hub and spoke model works. While VLANs are the hero of the on-premises world, in Azure things are virtual, so it is not a surprise that we have a construct in Azure called a virtual network or VNET. A VNET is an isolation boundary in Azure with no default ingress endpoint. We can’t think of a VNET the same way we think about a subnet in the on-premises world. A VNET can have one or more subnets, and you define the address space of a VNET in advance; subnets later consume portions of that VNET-level address space.
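The relationship between a VNET address space and its subnets can be sketched with Python’s standard `ipaddress` module. The `10.1.0.0/16` and `10.1.1.0/24` ranges come from the slide; the subnet names and the second subnet are hypothetical.

```python
import ipaddress

# VNET address space from the slide; subnet names are made up.
vnet = ipaddress.ip_network("10.1.0.0/16")
subnets = {
    "web": ipaddress.ip_network("10.1.1.0/24"),
    "data": ipaddress.ip_network("10.1.2.0/24"),
}


def validate(vnet, subnets):
    """Return a list of problems: subnets outside the VNET or overlapping."""
    problems = []
    items = list(subnets.items())
    for name, net in items:
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside {vnet}")
    for i, (name_a, net_a) in enumerate(items):
        for name_b, net_b in items[i + 1:]:
            if net_a.overlaps(net_b):
                problems.append(f"{name_a} overlaps {name_b}")
    return problems


print(validate(vnet, subnets))
```

An empty list means every subnet fits inside the VNET address space and no two subnets overlap, which is the constraint described above.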
It is an isolation boundary in the sense that workloads in different VNETs can’t talk to each other by default, while workloads within subnets in the same VNET can talk to each other and even have out-of-the-box name resolution. Routing between subnets in the same VNET is taken care of by Azure, so machines in different subnets within a VNET can reach each other. However, you can control or override this default behavior by configuring User Defined Routes (UDRs). This is powerful because we can now force the traffic exiting those subnets to go to a hub subnet where we maintain a firewall, so traffic between subnets gets inspected. This is also where we enable our virtual machines in Azure to route traffic to our on-premises infrastructure.
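How a user-defined route overrides the default routing can be sketched as a longest-prefix-match lookup, with user routes preferred over system routes when the prefixes are equal. This is a simplified model, not Azure’s implementation; the route table and the firewall address `10.1.0.4` are hypothetical.

```python
import ipaddress

# (address prefix, route source, next hop); all values are illustrative.
ROUTES = [
    ("10.1.0.0/16", "system", "VnetLocal"),
    ("0.0.0.0/0", "system", "Internet"),
    ("10.1.2.0/24", "user", "10.1.0.4"),  # send data-subnet traffic to the firewall
    ("0.0.0.0/0", "user", "10.1.0.4"),    # send outbound traffic to the firewall
]


def next_hop(destination, routes=ROUTES):
    """Pick the route with the longest matching prefix; user routes win ties."""
    dest_ip = ipaddress.ip_address(destination)
    candidates = []
    for prefix, source, hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest_ip in network:
            candidates.append((network.prefixlen, source == "user", hop))
    if not candidates:
        raise ValueError(f"no route to {destination}")
    return max(candidates)[2]


for dest in ("10.1.1.5", "10.1.2.7", "8.8.8.8"):
    print(dest, "->", next_hop(dest))
```

With this table, traffic to the `10.1.2.0/24` subnet and anything leaving the VNET is steered to the firewall, while other traffic inside the VNET keeps using the default system route.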
You might ask then: if I have one big VNET with multiple subnets that I can deploy all my workloads in, why do I need multiple VNETs? Well, a single VNET is hosted inside a subscription, like every Azure resource. In fact, every resource in Azure is a child of a subscription object, and this is how Microsoft reports the cost of resources per subscription. A single VNET can’t span multiple subscriptions, so if you are in a situation where you want to create more than one subscription (there are a lot of reasons why you might), then you need to create a VNET for each subscription.
For example, if you want to deploy the HR app and the marketing app in Azure, each in its own subscription, then you need to create a VNET for each subscription to host your virtual machines. You can’t create one VNET and stretch it across the two subscriptions. Going back to the hub and spoke model and the idea of having a shared services hub in its own subscription, you end up with three VNETs, one for each subscription. By default, VNETs have no ingress endpoint, as they are isolation boundaries. You can’t expect a VNET to receive traffic from the outside world or from another nearby VNET. The way to make different VNETs talk to each other is by defining a VNET peering, which plays a role similar to a VPN tunnel in the on-premises world. So in the hub and spoke model we talked about, the HR app VNET will have a VNET peering with the shared services VNET (the hub), and so will the marketing app VNET. From the other side, since peering is a one-way relationship, the shared services VNET will also have a VNET peering relationship with both the HR and marketing VNETs.
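Because each peering is a one-way relationship, a hub and spoke layout needs two peerings per spoke. The sketch below lists them for the HR and marketing example and treats two VNETs as peered only when both directions exist. The VNET names are hypothetical and the model is deliberately simplified.

```python
def hub_spoke_peerings(hub, spokes):
    """Return the one-way peerings needed: spoke->hub and hub->spoke per spoke."""
    links = set()
    for spoke in spokes:
        links.add((spoke, hub))
        links.add((hub, spoke))
    return links


def is_peered(a, b, links):
    """Two VNETs count as peered only when both directions are defined."""
    return (a, b) in links and (b, a) in links


links = hub_spoke_peerings("shared-services", ["hr-app", "marketing-app"])

print(sorted(links))
print(is_peered("hr-app", "shared-services", links))
print(is_peered("hr-app", "marketing-app", links))
```

The two spokes are not peered with each other directly; in this model any HR-to-marketing traffic has to be routed through the hub, which is where the central firewall sits.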
It is also worth mentioning that the traffic between resources in peered virtual networks is completely private and stays on the Microsoft backbone. It does not go through the public internet, which means it is a secure way of connecting workloads in different VNETs (and different subscriptions), even if the other VNET is in another Azure region (although there is a cost for such traffic).
Now if we deploy a VPN gateway in the shared services VNET and establish a VPN tunnel from there to our on-premises network, we can enable VMs and resources in both the HR and the marketing VNETs to use that VPN gateway (in the hub) to reach the on-premises world. This is the whole idea of the hub and spoke model: we only need one VPN gateway in the hub to serve the hybrid connectivity needs of the spoke VNETs, which eliminates the need for deploying VPN gateways and hybrid connectivity in each spoke VNET and reduces both cost and management overhead.
The mechanism by which the VPN gateway in the hub offers this service to the spoke VNETs is called Gateway Transit. Gateway Transit is a VNET peering property that enables one virtual network to use the VPN gateway in the peered virtual network for cross-premises connectivity. This also works if the VNETs are deployed in different Azure regions.
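A hedged sketch of how the two sides of a peering line up for gateway transit: the hub side allows gateway transit and the spoke side opts in to the remote gateway. The property names below follow the Azure virtual network peering resource as I understand it and should be verified against the official reference; the VNET names are hypothetical, and the check is a simplified model rather than Azure’s own validation.

```python
# Peering settings as plain dictionaries; VNET names are hypothetical.
hub_to_spoke = {
    "from": "hub-vnet",
    "to": "hr-vnet",
    "allowVirtualNetworkAccess": True,
    "allowForwardedTraffic": True,
    "allowGatewayTransit": True,   # the hub offers its VPN gateway
    "useRemoteGateways": False,
}
spoke_to_hub = {
    "from": "hr-vnet",
    "to": "hub-vnet",
    "allowVirtualNetworkAccess": True,
    "allowForwardedTraffic": True,
    "allowGatewayTransit": False,
    "useRemoteGateways": True,     # the spoke uses the hub's gateway
}


def can_use_hub_gateway(spoke_side, hub_side):
    """True when the peerings mirror each other and both sides enable transit."""
    mirrored = (
        spoke_side["from"] == hub_side["to"]
        and spoke_side["to"] == hub_side["from"]
    )
    return bool(
        mirrored
        and spoke_side["useRemoteGateways"]
        and hub_side["allowGatewayTransit"]
    )


print(can_use_hub_gateway(spoke_to_hub, hub_to_spoke))
```

If either flag is switched off, the spoke falls back to having no path to on-premises through the hub, which is why both peerings need to be configured together.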
Now this opens the door to a lot of design opportunities, and with those capabilities we start to shape the architecture that works for our current and future needs.
Now without such forward thinking, if you have three VNETs, each with its own VPN gateway, and two on-premises locations (Europe office and US office), you end up with three VPN tunnels per on-premises location, one to each of the cloud VPN gateways (six tunnels in total). Each VNET will have a separate firewall to inspect traffic going between that VNET and your on-premises land. Since VNETs by default can’t talk to each other, you will end up creating a mesh of VNET peerings between the three VNETs (two peering relationships per VNET, six in total), and if you have more VNETs, the number of VNET peerings you need to manage grows quadratically (in fact, there is a limit on the number of VNET peering relationships you can have in a single subscription).
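The arithmetic behind these numbers is easy to check. With n VNETs in a full mesh, each VNET needs a one-way peering to every other VNET, so there are n × (n − 1) peerings, while a hub and spoke layout needs only two per spoke. The short Python sketch below compares the two, along with the VPN tunnel counts; the function names are made up for illustration.

```python
def mesh_peerings(vnets):
    """One-way peerings in a full mesh: every VNET peers with every other."""
    return vnets * (vnets - 1)


def hub_spoke_peerings_count(spokes):
    """One-way peerings in hub and spoke: spoke->hub and hub->spoke per spoke."""
    return 2 * spokes


def separate_vpn_tunnels(vnets, sites):
    """Tunnels when every VNET has its own gateway to every on-premises site."""
    return vnets * sites


def hub_vpn_tunnels(sites):
    """Tunnels when only the hub connects to each on-premises site."""
    return sites


for n in (3, 4, 10):
    print(n, "LOB VNETs:",
          mesh_peerings(n), "mesh peerings vs",
          hub_spoke_peerings_count(n), "hub-spoke peerings;",
          separate_vpn_tunnels(n, 2), "tunnels vs", hub_vpn_tunnels(2))
```

Note that the hub and spoke figures assume one extra VNET for the hub itself. At three LOB VNETs the peering counts happen to be equal; the gap opens quickly as more LOB VNETs are added.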
You can see the miss we have here if each VNET in each subscription maintains its own firewall and hybrid connectivity to both on-premises locations. Security is also hard to manage as you can reach the Azure land from many entry points and your security team need to keep an eye on each of those entry points.There is obvious demand to think of a better way to govern how traffic and security are built in the cloud.
If you don’t know already, the cloud security alliance has a solution for us
The cloud security alliance is a non-profit organization with a mission to “promote the use of best practices for providing security assurance within cloud computing.
They solve this problem as they recommend implementing a preferred and flexible architecture for hybrid cloud connectivity using a “bastion virtual network”.
In this architecture, it is possible to connect multiple cloud networks to your on-premises datacenter via one hybrid cloud connection.
You build a dedicated virtual network for the hybrid connection and then peer any other networks through the designated bastion network.
You can also deploy firewall rule sets to protect traffic flowing in and out of the hybrid connection.
Now why is this important? The Cloud Security Alliance warns that any on-premises threat can propagate to the Azure land, and a compromised on-premises network can be used to scan your whole cloud network. Therefore, it is important to govern that hybrid connectivity, as it might become the weakest link.
In this architecture, there is a need to minimize the number of tunnels between on-premises and the cloud, and those connections should terminate in a bastion VNET (shared services VNET) where security controls are in place to inspect traffic passing between the on-premises land and Azure land.
Let’s say that one of the application servers in LOB 1 VNET wants to access a server on-premises. Traffic goes to the bastion VNET (shared services), thanks to the VNET peering, then to the VPN gateway in the bastion VNET, across the internet inside the VPN tunnel, down to the VPN gateway on-premises and finally to that server. Traffic is inspected at the central firewall deployed in the bastion VNET.
The hub and spoke architecture also reduces complexity when it comes to connecting your VNETs inside Azure. If you have four VNETs and you want them to talk to each other, a VNET peering is required between each pair of VNETs. This requires a lot of management to maintain those VNET peerings, and it doesn’t scale well as the number of VNETs increases. Azure also limits the number of VNET peerings you can have.
Moreover, security becomes a major issue. How can you regulate and inspect traffic going between VNETs? Of course you can use Network Security Groups, but this only gives you L4 traffic inspection, and managing those network security groups becomes difficult. Moreover, if your DevOps team is managing the LOB subscription, they can easily change the configuration of those network security groups.
The Hub and Spoke architecture solves all those problems. Each LOB VNET maintains a VNET peering only to the HUB VNET. We then deploy centralized firewall servers in the Hub VNET. This architecture scales well as the number of VNETs increases, as we only need one set of VNET peerings for each LOB VNET.
Any connectivity between two VNETs should go through the HUB VNET (shared services), where centralized firewall nodes are used to inspect and regulate traffic. Only the security team can change the configuration of the central firewalls, leaving the DevOps team with no option but to request access from security. This model scales well as the number of VNETs increases, as you only need one set of peerings between each LOB VNET and the Hub VNET.
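Here is a small sketch of that topology as a graph, to show that every path between two LOB VNETs, or between a LOB VNET and on-premises, crosses the hub. The node names and helper functions are hypothetical. In Azure you also have to route traffic to the firewall in the hub (for example with UDRs), which this sketch does not model.

```python
from collections import deque

# Sketch only: a hub-spoke topology as an adjacency map. Node names are
# hypothetical; "on-prem" stands for the site behind the hub VPN gateway.

def hub_spoke_graph(spokes, hub="hub", on_prem="on-prem"):
    """Every spoke peers only with the hub; on-premises attaches to the hub."""
    graph = {hub: set(spokes) | {on_prem}, on_prem: {hub}}
    for spoke in spokes:
        graph[spoke] = {hub}
    return graph


def shortest_path(graph, src, dst):
    """Breadth-first search returning the list of hops from src to dst."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in sorted(graph[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def mesh_peering_pairs(vnets):
    """Peering pairs needed to fully mesh the given number of VNETs."""
    return vnets * (vnets - 1) // 2


graph = hub_spoke_graph(["lob1", "lob2", "lob3", "lob4"])
print(shortest_path(graph, "lob1", "lob2"))
print(shortest_path(graph, "lob1", "on-prem"))
print("mesh pairs:", mesh_peering_pairs(4), "hub-spoke pairs:", len(graph["hub"]) - 1)
```

Four LOB VNETs need six peering pairs in a full mesh, but only four in hub and spoke, one per LOB VNET, and each additional LOB VNET adds just one more.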
Now we have a HUB VNET (a.k.a. bastion VNET, a.k.a. shared services VNET), and all LOB VNETs connect to it. This introduces another opportunity for optimization. We can take the infrastructure services that are common across all LOB VNETs and host them in the HUB VNET. Traditionally, you might have a domain controller, CA server and file server in each LOB VNET. If we have two LOB VNETs, this means we are deploying six servers, three in each LOB VNET.
Instead, we can deploy only one set of those servers in the HUB VNET and offer these common services to all LOB applications. We reduced the number of servers from six to three. If a server in one of the LOB VNETs wants to access a domain controller, it uses the VNET peering to the HUB VNET to access the domain controller there. This also makes onboarding a new LOB easy, since common infrastructure services are already deployed and offered as a service by the HUB VNET.
Jumpboxes are machines we connect from to reach other workloads in the cloud. Instead of allowing RDP or SSH access from everywhere to your valuable assets in the LOB VNETs, we only configure our firewalls to allow these connections from the jumpbox server. A perfect place for the jumpbox is the HUB VNET, as it already has a peering connection to all other VNETs. In some cases, people might deploy a jumpbox per VNET, but it’s up to you.
Now let’s connect all the dots together. In the cloud, the first thing we do is to create that HUB VNET (shared services VNET). In that VNET, we place our VPN gateway, centralized firewall, jumpboxes and shared services infrastructure (domain controllers, file servers, CA server,…).
Connectivity between Azure land and on-premises land is terminated at the HUB VNET so our centralized firewall in that VNET can inspect traffic passing through that hybrid connectivity. This enforces our governance and security rules and is the foundation of all other security controls.
For each LOB application, we create a dedicated VNET (spoke) and connect that VNET only to the HUB VNET. Traffic passing between VNETs must pass through the centralized firewall deployed in the HUB VNET. Any LOB VNET can access shared services infrastructure (domain controller,..) by using the VNET peer connection to the HUB.
Only the security and network engineers can manage the HUB VNET (subscription), and DevOps teams are not allowed to manage that shared services subscription. On the other hand, DevOps teams are given freedom to manage their LOB VNETs (subscriptions). This creates a perfect balance between the agility and speed of delivery that the cloud promises on the one hand, and security and governance on the other.
With such a design, we have a loosely coupled architecture. Think about it. Each LOB is deployed in its own VNET (and/or subscription), and shared services, firewall, and hybrid connectivity are all deployed in a separate HUB VNET (and/or subscription). If we no longer need a LOB application in the cloud, we can easily delete that subscription/VNET without affecting anything else in the cloud. There are no obvious dependencies between workloads deployed in LOB1 and LOB2.
Segregation of duties is easy to implement in this architecture. SecOps and NetOps manage the HUB subscription where hybrid connectivity and central firewalls are deployed. DevOps don’t have access to that subscription and can’t change the firewall settings to make their life easier. They need to comply with the central security policy.
On the other hand, since each LOB application is deployed in its own separate VNET (and/or subscription), the DevOps team can be given a lot of control over that subscription to create resources and deploy faster. DevOps or developers working on one LOB might not be the same DevOps or developers working on another LOB application. With this architecture, we can give each set of people access to their own LOB subscription, which enforces the principle of segregation of duties.
Another key security design pattern I want to share with you here is rethinking how to store, access and retain logs, monitoring data, playbooks and other security/management related resources.
Most cloud resources that you deploy in Azure have a setting to configure auditing or monitoring, and usually you need to store this data somewhere (a storage account or a log analytics workspace). Let’s say you have a separate subscription for your LOB called the HR LOB subscription. You have a VNET there with all your virtual machines and other Azure resources. When you deploy resources inside the HR LOB subscription, you want to store the audit logs somewhere (a storage account), so you create a storage account in the HR LOB subscription, and everything works fine.
This will work just fine, but there are things I want to share with you here. First, audit logs and security logs are classified as critical data that should be secured and not altered. The integrity of such information should be maintained, and if your developers are owners of the HR LOB subscription, they can do something bad and delete that audit data, since they have access to the storage account deployed in the same subscription.
Instead, you want to preserve the integrity of such logs by storing them in a separate isolated place that developers can’t change. They might have read access to the logs to troubleshoot problems, but they should not be able to change or delete that data. One option is to create a separate subscription (Auditing and Logging Subscription) where only the security team has full access and everyone else has read access.
When you then configure audit logging or any type of logging on your applications, you create a storage account or log analytics workspace in the Auditing and Logging Subscription and configure your application or Azure resource to use that.
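The access model for that subscription can be sketched as a small lookup. Owner and Reader are Azure built-in role names, but reducing them to three actions is a simplification made for this sketch, and the team names, subscription names and the `can` helper are hypothetical.

```python
# Sketch only: who can do what in the Auditing and Logging Subscription.
# Real Azure RBAC is far more granular than these three actions.

ROLE_ACTIONS = {
    "Owner": {"read", "write", "delete"},
    "Reader": {"read"},
}

# (principal, subscription) -> role; all names here are hypothetical.
ASSIGNMENTS = {
    ("security-team", "audit-logging-sub"): "Owner",
    ("hr-devops", "audit-logging-sub"): "Reader",
    ("hr-devops", "hr-lob-sub"): "Owner",
}


def can(principal, action, subscription):
    """Return True if the principal's role in the subscription allows the action."""
    role = ASSIGNMENTS.get((principal, subscription))
    return role is not None and action in ROLE_ACTIONS[role]


print(can("hr-devops", "read", "audit-logging-sub"))
print(can("hr-devops", "delete", "audit-logging-sub"))
print(can("security-team", "delete", "audit-logging-sub"))
```

The HR developers stay owners of their own LOB subscription, but in the logging subscription they can only read, so the audit data they generate is out of their reach to change or delete.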
Another thought I have: what if you stored logs and auditing data for the HR LOB applications in the same HR LOB subscription, and then for any reason you don’t need that HR application anymore and you want to delete that subscription? Deleting the subscription means losing all the audit and log data. Sometimes, for compliance reasons, you need to maintain the logs for a long period of time. Storing the logs in a separate subscription solves the problem. You can delete the HR subscription while maintaining the log data for a long period.
First, let us talk about the first hierarchy, the Enterprise hierarchy. When you start planning your enterprise hierarchy, think of cost and billing. Of course, there is a cost when you deploy and consume resources and services in the cloud; it is not free. It is usually based on consumption: the more you consume, the more you pay.