Understanding the Difference between Overlay and Underlay Networks

A remarkable thing about how we approach cloud connectivity is that we combine a carrier-class underlay with a level of automation typically associated with an overlay network. Our CEO Dave Ward and co-founder and SVP of Product and Engineering Anna Claiborne recently addressed just this topic in a talk on API-driven underlay networking at WAN Summit. So it’s an excellent opportunity to unpack how these two approaches to networking differ, as applied to the enterprise cloud core.

Defining Underlay and Overlay Networking

Underlay networks refer to the physical network infrastructure: DWDM equipment (in the case of wide-area networks), Ethernet switches and routers (from vendors like Arista, Cisco, Juniper, and Nokia), and the physical cable plant, such as the fiber optic cabling that connects all these devices into a network topology.

Underlay networks can be Layer 2 or Layer 3 networks. Layer 2 underlay networks today are typically based on Ethernet, with segmentation accomplished via VLANs. The Internet is an example of a Layer 3 underlay network, where Autonomous Systems run control planes based on interior gateway protocols (IGPs) such as OSPF and IS-IS, and BGP serves as the Internet-wide routing protocol. Multiprotocol Label Switching (MPLS) networks are a legacy underlay WAN technology that falls between Layer 2 and Layer 3.

By contrast, overlay networks implement network virtualization concepts. A virtualized network consists of overlay nodes (e.g., routers), where Layer 2 and Layer 3 tunneling encapsulation (such as VXLAN, GRE, and IPsec) serves as the overlay transport protocol. Cisco’s OTV (Overlay Transport Virtualization) is one example of such a technology.
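
To make the encapsulation idea concrete, here is a minimal Python sketch of the 8-byte VXLAN header defined in RFC 7348 and the per-packet overhead it implies when carried over an IPv4 underlay. The function and constant names are ours, chosen for illustration, and the sketch only builds and parses bytes; it does not send traffic.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def vxlan_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking that the I flag is set."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the VXLAN header to an inner Ethernet frame.

    The outer UDP, IP, and Ethernet headers are added by the tunnel
    endpoint's network stack and are not built here.
    """
    return vxlan_header(vni) + inner_frame


# Overhead added to every inner frame over an untagged IPv4 underlay:
# outer Ethernet (14) + outer IPv4 (20) + outer UDP (8) + VXLAN (8).
VXLAN_OVERHEAD_BYTES = 14 + 20 + 8 + 8
```

The 24-bit VNI is what lets an overlay carry roughly 16 million segments, compared with the 12-bit VLAN ID available for Layer 2 segmentation in the underlay. The 50 bytes of overhead per packet is also why the underlay MTU matters to anything tunneled across it.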

There are two prominent examples of virtual network overlays. The first and best known are SD-WAN architectures that rely heavily on VPN functionality to replace MPLS circuits, making it less costly and easier to connect various branch offices, retail locations, and other remote sites to a WAN. The other example is cloud-native networking, where encapsulating traffic with VPN tunnels is the preferred method of connecting VPCs to enterprise locations.

Overlay Network Achilles’ Heel: The Internet

Overlay networks offer notable benefits, including software-driven network automation and VPN privacy between tunnel endpoints. Recently, providers of multi-cloud networking have created solutions that further abstract the per-CSP networking logic so that it’s easier to manage overlay connectivity between clouds. These are all good things.

However, overlay networks can’t escape the gravitational pull of the Internet as an underlying network. VPN tunnels or not, the Internet isn’t private, is rife with security threats, and exacts a significant latency tax on traffic flows due to its shared, collective nature. Even cloud provider backbones are shared network services that function as extensions of the Internet.

Furthermore, there are workflows that just don’t belong in a VPN tunnel, such as when you’re:

  • Building a WAN between colocation data centers
  • Connecting a digital operations backbone to reach edge locations
  • Moving significant numbers of user flows to a critical enterprise cloud or SaaS application
  • Transporting significant volumes of application, transaction, or data replication traffic on a hybrid or multi-cloud basis
  • Trying to do anything latency-sensitive

Finally, a significant challenge with Internet VPNs in the cloud networking context is economics. Egress data charges from cloud providers for transporting high-volume traffic via their Internet-connected backbones are costly and unpredictable.

The Problematic Alternative: Telco Connections

If you need scalability, privacy, security, and predictable low latency, traditionally, that has meant turning to telco underlay connectivity. But we encounter numerous problems on this path. First of all, getting telco connections isn’t easy. It can take weeks to get a quote on WAN services. Then, it can take months to provision the service. Further, you typically have to commit to your peak anticipated bandwidth level for three years, even if you’re nowhere close to that level now, which is highly wasteful. Fortunately, there’s a better underlay option.
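
To put the commitment problem in rough numbers, the sketch below estimates how much of a fixed three-year commit actually gets used when demand ramps up over the term. All figures are hypothetical and chosen only for illustration: a 10 Gbps commit and a linear ramp from 2 Gbps to 10 Gbps. The function name is ours.

```python
def committed_utilization(commit_mbps: float, monthly_demand_mbps: list[float]) -> float:
    """Fraction of committed capacity actually used across the term.

    Demand above the commit is capped, since the circuit cannot carry it.
    """
    used = sum(min(demand, commit_mbps) for demand in monthly_demand_mbps)
    return used / (commit_mbps * len(monthly_demand_mbps))


# Hypothetical scenario: commit to the anticipated peak on day one,
# while real demand grows linearly toward that peak over 36 months.
TERM_MONTHS = 36
COMMIT_MBPS = 10_000
START_MBPS = 2_000

demand = [
    START_MBPS + (COMMIT_MBPS - START_MBPS) * month / (TERM_MONTHS - 1)
    for month in range(TERM_MONTHS)
]

utilization = committed_utilization(COMMIT_MBPS, demand)
print(f"Used {utilization:.0%} of committed capacity over the term")
```

Under these assumed numbers, about 60% of the committed capacity is used and the rest is paid for but idle. A different ramp changes the figure, but any commit sized to a future peak leaves a gap of this kind.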

Automated Underlay Turns WAN into Cloud

At PacketFabric, we approached underlay networking with SDN automation from the ground up. But let’s first unpack what we mean by our underlay network. The PacketFabric network is a telco-grade 50T+ private optical network built across hundreds of redundant PoPs and leased dark fiber paths. To quote our co-founder Anna Claiborne from the WAN Summit talk:

“This is OUR network. We run the DWDM gear; we leased the dark fiber. We do the hard stuff, so everyone else doesn’t have to. And on top of that dark fiber network, we have our own Ethernet fabric that provides layer two and three services. So the great thing about operating our own network from the ground up is that we can provide an exceptionally resilient network service.”

We built our automation stack from the ground up. In telco terms, we developed an OSS and BSS that completely automates the different layers of the network and offers services with native multitenancy. PacketFabric writes the code and provides the API for every function. That API-driven operation affords customers a new operating paradigm that eliminates traditional networks’ provisioning and negotiation delays while providing consumption terms aligned with cloud consumption.

Essentially, the PacketFabric network platform operates as a cloud. This means that customers can consume private optical network services like SaaS.

As Dave Ward noted in the talk, “The entire service, to get to a cloud or multiple clouds, or build your own network or build the connectivity you need, you can get this down to a minute, or single minutes. The network is code.”

But don’t just take our word for it. Instead, check out this LinkedIn blog written by the CIO of a fintech company. In it, he explains how PacketFabric rescued their WAN from a third-party provider VPLS service issue by spinning up long-haul data center interconnections in minutes.

A Stark Performance Difference

During the talk, Anna gave a demo of a multi-cloud Kafka pipeline to show the performance impact of using a connectivity cloud like PacketFabric to build your cloud core versus an Internet-based overlay network. The demo was built with Python and used PacketFabric APIs to create connectivity on demand with our CloudRouter, driving home the point that the network has to behave like code, just like cloud compute. If you watch the demo replay, you’ll be astounded at how big the performance difference is.
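
As a rough illustration of what “the network has to behave like code” can look like, here is a minimal Python sketch of describing a connection as data and creating and removing it programmatically. It is not the demo code and not the PacketFabric API: every class, field name, and value below is hypothetical, and the provisioner is an in-memory stand-in so the sketch runs without network access.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ConnectionRequest:
    """Hypothetical description of a desired connection (illustrative fields only)."""
    source_pop: str
    destination_cloud: str
    speed_mbps: int
    term_months: int = 1


def to_api_payload(request: ConnectionRequest) -> str:
    """Serialize a request to the JSON body a provisioning API might accept."""
    if request.speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    return json.dumps(asdict(request), sort_keys=True)


class FakeProvisioner:
    """In-memory stand-in for a provider API, so the sketch runs offline."""

    def __init__(self) -> None:
        self._connections: dict[str, ConnectionRequest] = {}
        self._next_id = 1

    def create(self, request: ConnectionRequest) -> str:
        to_api_payload(request)  # validate as a real client would before sending
        connection_id = f"conn-{self._next_id}"
        self._next_id += 1
        self._connections[connection_id] = request
        return connection_id

    def delete(self, connection_id: str) -> None:
        del self._connections[connection_id]

    def active(self) -> list[str]:
        return sorted(self._connections)


# Usage: bring connectivity up for a job, then tear it down afterwards.
provisioner = FakeProvisioner()
conn_id = provisioner.create(ConnectionRequest("pop-a", "cloud-x", speed_mbps=1000))
print(provisioner.active())
provisioner.delete(conn_id)
```

The point of the pattern is that connectivity has the same lifecycle as any other cloud resource: declared as data, created on demand, and released when the workload finishes.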

Get Automated Underlay Now

If you’re intrigued by how you can get this level of connectivity, check out our services, look into our locations, and get a flavor of our pricing. Then request a demo or just fire up and get connected today.