Egress IP decision tree updated
I had another one of those weird problems that made me revisit the egress IP decision tree.
System design patterns, architectural approaches, and infrastructure design
One of the downsides of private previews is that they are under NDA, so you can't really talk about them. However, I can now talk about Azure Private Link Direct Connect because it has reached public preview. It solves one of the problems with Private Link Services (PLS) that has been bugging me for a while: you have to put a load balancer or an application gateway in front of the service.
I have been working on a comprehensive approach to bringing network automation and documentation into a development-style workflow. Rather than replacing the traditional ITSM approach to change management, it moves infrastructure towards a CI/CD approach to releases with automation and baked-in documentation.
Adam Stuart has a rather excellent rundown of the various ways you can approach SD-WAN connectivity into Azure, providing comprehensive technical guidance for Azure-based deployments. Much of the same applies in AWS, although I have often said that AWS networking is more complex and akin to something dreamt up by a stoned developer who couldn't even spell BGP. One of the legacy options included at the end of Adam's article is the cloud edge topology, where you deploy physical hardware into a carrier-neutral facility (CNF) like Equinix and use that as an interconnect between your SD-WAN and an ExpressRoute or Direct Connect circuit. This got me thinking about the uncertainty many organisations face when deciding how their overall cloud connectivity should evolve.
This article explores the journey from simple single-site connectivity to sophisticated multi-cloud SD-WAN architectures, examining the trade-offs and implications of each approach. We'll walk through real-world topologies that organisations I have worked with commonly implement, from basic VPN connections to cloud-native SD-WAN NVA hubs, helping you understand which approach might be right for your organisation's scale and maturity level.
In a recent blog post I wrote: "As network engineers we are used to the declarative model of configuration management and so this fits nicely into that mindset - you declare what you want and Terraform will make it so." But declaring what you want is only half the battle. The real challenge lies in how you structure that declaration to handle the messy reality of business requirements whilst maintaining the automation benefits that drew us to declarative tools in the first place.
I happened upon the diagram below within the pages on default outbound internet access and it seemed a little counterintuitive. The decision flow seems to suggest that a VM will use the egress IP of a NAT gateway in preference to an assigned public IP (PIP).
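One way to read that decision flow is as a first-match precedence check. The sketch below is only an illustration, not Azure's implementation: the class, the field names and the returned labels are hypothetical, and every step other than "NAT gateway before PIP" is an assumption about how the flow is usually described rather than something stated above.

```python
from dataclasses import dataclass


@dataclass
class VmNetworkContext:
    """Hypothetical summary of the settings that influence a VM's outbound path."""
    udr_to_appliance: bool = False    # assumed: route table sends 0.0.0.0/0 to an NVA or firewall
    nat_gateway: bool = False         # NAT gateway associated with the VM's subnet
    instance_public_ip: bool = False  # PIP assigned directly to the VM
    lb_outbound_rule: bool = False    # assumed: backend pool member with outbound rules


def egress_path(ctx: VmNetworkContext) -> str:
    """Return the first outbound method that applies, in assumed precedence order."""
    if ctx.udr_to_appliance:
        return "virtual appliance / firewall"
    if ctx.nat_gateway:
        # The counterintuitive step: the NAT gateway is checked before the PIP.
        return "NAT gateway"
    if ctx.instance_public_ip:
        return "instance-level public IP"
    if ctx.lb_outbound_rule:
        return "load balancer outbound rule"
    return "default outbound access"


# A VM with both a NAT gateway on its subnet and its own PIP.
print(egress_path(VmNetworkContext(nat_gateway=True, instance_public_ip=True)))
```

Under these assumptions, a VM with its own PIP still leaves via the NAT gateway whenever one is attached to the subnet, which is the behaviour the diagram appears to describe.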
In my previous post, I shared some basic latency tests across Azure networks. The results were pretty predictable: the closer things are physically, the faster they communicate. Not exactly groundbreaking.
But when I expanded my testing to include longer distances and different connection methods, I stumbled onto something genuinely surprising: Private Link connections can actually be faster than direct VNet peering - sometimes significantly so.
When I set out to explore network latency in Azure, I had a simple goal: to understand how physical distance affects performance. After all, we've all heard that farther apart means slower connections. But I wanted specifics - exactly how much slower? And how consistent is that performance? I also wanted to see how long-lived TCP connections performed across the Azure network.
I'm sharing what I've learned from my first round of tests, setting a baseline that we can build on later.
The Case for Application-Level Controls
I've noticed that an organisation's approach to securing outbound internet traffic often reflects its security maturity more than its technical requirements. System-to-system communication, such as API calls to cloud services, presents fundamentally different challenges compared to user browsing. Understanding these differences is crucial for implementing effective security controls without unnecessary complexity or risk.