Ethernet and IP Networking 101 (Heavily Illustrated)

As a software engineer, I need to deal with networking every now and then - be it configuring a SOHO network, setting up container networking, or troubleshooting connectivity between servers in a data center. The domain is pretty broad, and the terminology can get confusing quickly. This article is my layman's attempt to sort out the basics with a minimum of words and a maximum of drawings. The primary focus will be on the Data link layer (OSI L2) of wired networks, where Ethernet is king nowadays. But I'll briefly touch upon its neighboring layers too.

What is LAN?

LAN (Local Area Network) - [broadly] a computer network that interconnects computers within a limited area such as a residence, school, office building, or data center. A LAN is not limited to a single IP subnetwork. Much like any WAN, a LAN can consist of multiple IP networks communicating via routers. The main determinant of a LAN is the locality (i.e. proximity) of the participants, not the L3 topology.

Network link - a physical and logical network component used to interconnect [any kind of] nodes in the network. All the nodes of a single network link use the same link-layer protocol. Examples: a bunch of computers connected to a network switch (Ethernet); a bunch of smartphones connected to a Wi-Fi access point (non-Ethernet).

What is Network Segment?

Network segment - [broadly] a portion of a computer network. The actual definition of a segment is technology-specific (see below).

What is L1 Segment?

L1 segment (aka physical segment, aka Ethernet segment) - a network segment formed by an electrical (or optical) connection between networked devices using a shared medium. Nodes on a single L1 segment have a common physical layer.

In the early days of Ethernet, a bunch of computers connected to a shared coaxial cable formed a physical segment (so-called bus topology). The coaxial cable served as a shared medium between multiple nodes. Everything sent by one of the nodes was seen by all other nodes of the segment. Thus, the nodes formed a single broadcast domain (this is 👌). Since multiple nodes could transmit frames simultaneously over a single cable, collisions were likely to occur. Hence, an L1 segment formed a single collision domain (this is 👎).

Ethernet as it started, 100 000 years ago.

As an evolution of Ethernet technology, twisted-pair cables connected to a common repeater hub replaced the shared coaxial cable (so-called star topology). When a node on one of the hub's ports transmitted frames, they were retransmitted from all the other ports of the hub. Frames were retransmitted as-is, i.e. with no modification or filtering (hubs were pretty dumb devices). All the nodes connected to the hub still formed a single L1 segment (hence, a single broadcast domain 👌, hence a single collision domain 👎).

Evolution of Ethernet, 500 A.D.
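
The hub behavior described above fits in a few lines of Python. This is only a toy model to make the "repeat everything everywhere" idea concrete; the `Hub` class and its port numbering are my own illustration, not a model of any real device.

```python
class Hub:
    """A repeater hub: every frame is repeated out of all other ports, unmodified."""

    def __init__(self, port_count):
        self.port_count = port_count

    def receive(self, in_port, frame):
        # No inspection and no filtering: repeat the frame as-is
        # to every port except the one it arrived on.
        return {port: frame for port in range(self.port_count) if port != in_port}


hub = Hub(port_count=4)
delivered = hub.receive(in_port=1, frame=b"hello")
print(sorted(delivered))  # [0, 2, 3]
```

Every node hears every frame, which is exactly why a hub-based network remains a single broadcast domain and a single collision domain.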

Both coaxial and hub-based approaches are obsolete now.

Nowadays, the star topology prevails. However, hubs have been replaced by more advanced network switch devices (aka bridges). An L1 segment was de facto reduced to a single point-to-point link between an end-node and a switch (or a switch and another switch). Since there are only two nodes on a physical link, the potential collision domain became very small. In reality, most modern wiring is full-duplex, so collisions simply cannot occur at all 🎉 Curious what happened to the broadcast domain? Then keep reading!

Ethernet via network switch, present day.

Disclaimer: In this article, the terms switch and bridge are used interchangeably. However, modern networking hardware is slightly more complex. So, whenever you see the word "bridge" here read it as "multi-port bridge". And every time you see a "switch", assume a "Layer 2 switch" only. These two things are more or less the same. Check out Bridge vs Switch: What I Learned From a Data Center Tour for more details.

Another example of the contemporary L1 segment is a point-to-point connection between two end-nodes via a patch or crossover cable.

What is Collision Domain?

Collision domain - a network segment connected by a shared medium or through hubs where simultaneous data transmissions collide with one another. Hence, the bigger a collision domain, the worse. Nowadays, collision domains are common in wireless (i.e. non-Ethernet) networks, while back in the day they were common in Ethernet networks (see What is L1 Segment).

In the Ethernet world, network switches (aka bridges) form borders of collision domains.

Three collision domains separated by a bridge (rather dated setup).

What is L2 Segment?

L2 segment - multiple L1 segments interconnected using a shared switch (aka bridge) or (somewhat recursively) multiple L2 segments merged into a bigger L2 segment by an upper-layer switch* where nodes can communicate with each other using their L2 addresses (MAC) or by broadcasting frames.

L2 segment examples.

* Other implementations of L2 segments are possible, see VLAN and VXLAN sections.

Layer 2 Ethernet frame - super simple format.
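
To show how simple the format really is, here is a minimal sketch that builds and parses the 14-byte Ethernet header: destination MAC, source MAC, and EtherType. It assumes the common Ethernet II layout and deliberately ignores the preamble, the trailing checksum, and minimum-size padding. The helper names and the MAC address in the example are made up for illustration.

```python
import struct

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"
ETHERTYPE_IPV4 = 0x0800


def mac_to_bytes(mac):
    return bytes(int(part, 16) for part in mac.split(":"))


def bytes_to_mac(raw):
    return ":".join(f"{b:02x}" for b in raw)


def build_frame(dst_mac, src_mac, ethertype, payload):
    # Header: 6-byte destination MAC, 6-byte source MAC, 2-byte EtherType,
    # all in network (big-endian) byte order.
    header = struct.pack("!6s6sH", mac_to_bytes(dst_mac), mac_to_bytes(src_mac), ethertype)
    return header + payload


def parse_frame(frame):
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return bytes_to_mac(dst), bytes_to_mac(src), ethertype, frame[14:]


frame = build_frame(BROADCAST_MAC, "02:00:00:00:00:01", ETHERTYPE_IPV4, b"payload")
dst, src, ethertype, payload = parse_frame(frame)
```

The EtherType field tells the receiver what is inside the payload, e.g. 0x0800 for an IPv4 packet.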

That's where things get interesting.

If L1 segments are about the physical connectivity of nodes, L2 segments are rather about logical connectivity. The 1:1 and 1:All addressing provided by the Data link layer is vital for implementing higher-layer protocols (ARP, IP, etc.). See labs in the following sections.

Node sends frame using destination MAC address.

Node broadcasts frame.
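
The two delivery modes above can be sketched as a toy switch. Note that the MAC learning part is my assumption about how a typical transparent bridge decides where to forward (the text above doesn't go into it): the switch remembers which port each source MAC was seen on and floods frames whose destination it doesn't know yet. The class and the addresses are hypothetical.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    def __init__(self, port_count):
        self.port_count = port_count
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is forwarded to."""
        self.mac_table[src_mac] = in_port  # learn where the sender lives
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST_MAC or out_port is None:
            # Broadcast or unknown destination: flood to all other ports.
            return [p for p in range(self.port_count) if p != in_port]
        if out_port == in_port:
            return []  # destination sits behind the ingress port already
        return [out_port]


switch = LearningSwitch(port_count=3)
node_a, node_b = "02:00:00:00:00:0a", "02:00:00:00:00:0b"
first = switch.receive(0, node_a, BROADCAST_MAC)  # A broadcasts: flooded
reply = switch.receive(1, node_b, node_a)         # B answers A: one port only
again = switch.receive(0, node_a, node_b)         # A reaches B: one port only
```

Unlike a hub, the switch sends a unicast frame only where it needs to go, while a broadcast frame still reaches every node of the L2 segment.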

What is Broadcast Domain?

Broadcast domain - all the nodes of a single L2 segment, i.e. the nodes that can reach each other using a broadcast L2 address (ff:ff:ff:ff:ff:ff). See the IP networks section to understand how L2 broadcast domains are used by higher layers.

In the early days of Ethernet, collision domains and broadcast domains were formed by physically interconnected nodes. I.e. a typical broadcast domain would consist of all the nodes of an L1 segment, and such a broadcast domain would be equal to the underlying collision domain. But while the collision domain was an unfortunate byproduct of the direct interconnection of nodes, the broadcast capabilities of such interconnection came in handy. So, the historical fight with collisions didn't affect the borders of broadcast domains.

With the invention of transparent bridges, it became possible to extend broadcast domains without extending collision domains by bridging multiple L1 segments using network switches. Nowadays, hierarchical topologies of interconnected switches are used to form broadcast domains with thousands of hosts.

Normally, L3 routers form borders of broadcast domains. However, VLANs can be configured to split a single L2 segment into multiple non-intersecting L2 segments, and hence into multiple broadcast domains.

Check out the lab on how to use a Linux bridge (virtual network switch) to extend broadcast domains.

VLAN

VLAN - [broadly] any broadcast domain that is partitioned and isolated at the data link layer (L2). Technically speaking, VLAN is a mechanism of tagging Ethernet frames of a single L2 segment with some integer IDs (so-called VIDs).

Frames with different IDs logically belong to different networks. This creates the appearance and functionality of network traffic that is physically on a single network segment but acts as if it is split between separate network segments. VLANs can keep network applications separate despite being connected to the same physical (or virtual) network.

Two Virtual LANs on a single bridge.
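
Here is a rough sketch of what tagging looks like on the wire. It assumes IEEE 802.1Q tagging (the most common VLAN implementation, though the text above doesn't name it): a 4-byte tag is inserted right after the source MAC, consisting of the 0x8100 marker and a 16-bit field whose lower 12 bits carry the VID. The function names are mine.

```python
import struct

TPID_8021Q = 0x8100  # marks the frame as VLAN-tagged


def add_vlan_tag(frame, vid, priority=0):
    if not 1 <= vid <= 4094:
        raise ValueError("VID must be in 1..4094 (0 and 4095 are reserved)")
    # Tag control field: priority (3 bits), drop-eligible bit (left at 0), VID (12 bits).
    tci = (priority << 13) | vid
    tag = struct.pack("!HH", TPID_8021Q, tci)
    # The tag goes between the source MAC (bytes 6..11) and the original EtherType.
    return frame[:12] + tag + frame[12:]


def read_vlan_id(frame):
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None  # untagged frame
    return tci & 0x0FFF


untagged = bytes(6) + bytes(6) + struct.pack("!H", 0x0800) + b"payload"
tagged = add_vlan_tag(untagged, vid=42)
```

A VLAN-aware bridge can then use the VID to decide which ports a frame is allowed to leave through, keeping frames with different IDs apart.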

Check out the lab on how to set up a simple VLAN using a Linux bridge.

VLAN technology can be seen as inverse to bridging. Bridges merge multiple L2 segments (and broadcast domains) into one bigger L2 segment. VLANs split a single L2 segment (potentially formed by bridging multiple smaller L2 segments) into multiple non-intersecting L2 segments (and broadcast domains).

What is L3 Segment?

L3 segment - same as IP subnetwork (e.g. 192.168.0.0/24 or 172.18.0.0/16).
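
For illustration, Python's standard `ipaddress` module can tell whether two addresses belong to the same L3 segment. The interface address and the `same_l3_segment` helper below are hypothetical examples.

```python
import ipaddress

subnet = ipaddress.ip_network("192.168.0.0/24")
print(subnet.num_addresses)  # 256


def same_l3_segment(local_interface, remote_ip):
    # local_interface is an address with a prefix length, e.g. "192.168.0.17/24".
    iface = ipaddress.ip_interface(local_interface)
    return ipaddress.ip_address(remote_ip) in iface.network


on_link = same_l3_segment("192.168.0.17/24", "192.168.0.200")
off_link = same_l3_segment("192.168.0.17/24", "192.168.1.200")
```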

Notice that up to this point we haven't been talking about IP (L3) addressing. Communication within a single L2 segment required only MAC (L2) addresses. We know that when a node emits a frame with a certain destination MAC address, it'll be delivered by the underlying L2 networking means to the destination node. Additionally, any node can emit a broadcast frame with the destination MAC ff:ff:ff:ff:ff:ff and it'll be delivered to all the nodes of its L2 segment. But how can a node reach another node (on the same L3 segment) by its IP address?

How to send IP packet

First and foremost, IP packets are sent wrapped into Ethernet frames (assuming the Layer 2 protocol in use is Ethernet, of course). I.e. IP protocol data units (packets) are encapsulated in the Ethernet protocol data units (frames).

Thus, the task of sending an IP packet within an L3 segment boils down to sending an Ethernet frame with the IP packet inside to the L2 segment's node that owns that destination IP. Hence, the sending node needs to learn the receiving node's MAC address first. So, some sort of L3 (IP) to L2 (MAC) address translation mechanism is required. This is usually done by a neighbor discovery protocol (ARP for IPv4 and NDP for IPv6) that relies on L2 broadcast capabilities (strictly speaking, NDP uses L2 multicast rather than broadcast).

When the IP to MAC translation is not known, the transmitting node sends a broadcast L2 frame with a query like "Who has IP 192.168.38.12?" expecting to get back a point-to-point L2 response from the owner of that IP. Such a response will obviously contain the MAC address of the node possessing the requested IP. Once the destination MAC address is known to the sender node, it just wraps an IP packet into an L2 frame destined to that MAC address. Thus, an L3 segment heavily relies on the underlying L2 segment capabilities.
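
The "Who has IP 192.168.38.12?" query can be sketched as raw bytes. This assumes the classic ARP-over-Ethernet packet layout for IPv4. The sender's MAC and IP are made-up values, and the code only builds the frame without sending anything.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac):
    return bytes(int(part, 16) for part in mac.split(":"))


def build_arp_request(sender_mac, sender_ip, target_ip):
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6, 4,                         # MAC and IPv4 address lengths
        1,                            # operation: request
        mac_to_bytes(sender_mac),
        socket.inet_aton(sender_ip),
        bytes(6),                     # target MAC: unknown, that's the question
        socket.inet_aton(target_ip),
    )
    # The query is broadcast, so every node of the L2 segment sees it.
    eth = struct.pack("!6s6sH", b"\xff" * 6, mac_to_bytes(sender_mac), ETHERTYPE_ARP)
    return eth + arp


frame = build_arp_request("02:00:00:00:00:01", "192.168.38.10", "192.168.38.12")
```

The owner of the target IP would answer with a similar packet (operation 2, reply) addressed directly to the sender's MAC.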

L3 to L2 segments relationship

There is an interesting relationship between L3 and L2 segment borders. It's pretty common to have a 1:1 mapping of L3 and L2 segments. However, technically nothing prevents us from having multiple L3 segments over a single L2 broadcast domain.

If stricter isolation is required, VLANs can be configured to split the L2 segment into multiple non-intersecting broadcast domains.

Interestingly, in some (rather exceptional) cases, a single L3 segment can be configured over multiple L2 segments interconnected via a router. The technique is called Proxy ARP and it's documented in the (rather dated) RFC 1027.

See "L3 to L2 Segments Mapping" and "Proxy ARP" labs.

Crossing L3 borders

Communication between any two different L3 segments always requires a router.

When a node wants to send an IP packet to a node that resides in another L3 segment (IP subnet), it needs to send that packet to a gateway router. Since nodes can talk directly only with other nodes from the same L2 segment, one of the router's interfaces has to reside on the sender's L2 segment. The IP address of the router can be obtained from the routing table that every node is supposed to have configured. So, the packet sending procedure is pretty much the same as above, but instead of directing an Ethernet frame with the wrapped IP packet to the final destination's MAC address (which can hardly be known to the sender), the node passes it to the router. Routers are usually connected to multiple network segments. So, when a router gets such a frame, it unwraps it and resends the underlying IP packet using one of its other interfaces. I.e. the next-hop router of every router has to reside on one of the L2 segments the router is directly connected to.
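
The sender's decision can be sketched as a tiny routing table lookup. The table format, the addresses, and the `next_hop` function are hypothetical, and picking the most specific matching route is my assumption about how such tables are typically consulted.

```python
import ipaddress


def next_hop(routes, dst_ip):
    """Pick the most specific matching route; return (next-hop IP, interface)."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [r for r in routes if dst in ipaddress.ip_network(r["dst"])]
    if not matches:
        raise LookupError(f"no route to {dst_ip}")
    best = max(matches, key=lambda r: ipaddress.ip_network(r["dst"]).prefixlen)
    # A directly connected route has no gateway: the frame goes straight
    # to the destination. Otherwise it goes to the gateway router.
    return (best["gateway"] or dst_ip, best["dev"])


routes = [
    {"dst": "192.168.0.0/24", "gateway": None, "dev": "eth0"},
    {"dst": "0.0.0.0/0", "gateway": "192.168.0.1", "dev": "eth0"},
]

local = next_hop(routes, "192.168.0.42")  # same L3 segment
remote = next_hop(routes, "203.0.113.7")  # another L3 segment
```

The node then resolves the MAC address of the returned next-hop IP and sends the frame there. The destination IP inside the packet stays the final one, only the destination MAC of the frame points at the router.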

VXLAN

VXLAN - another network virtualization technology, somewhat similar to VLAN but much more powerful.

To some extent, VLAN can be considered an overlay network. I.e. VLAN allows one to create multiple virtual segments on top of an existing network segment. However, there are some significant limitations. VLAN assumes that there is already a broadcast domain underneath, so it'll split it into multiple sub-domains by tagging frames. Additionally, there cannot be more than 4096 VLANs sharing the same underlying L2 segment (and a couple of those IDs are reserved). There are simply only 12 bits available to encode the VLAN ID field in the Ethernet frame format.

VXLAN technology also creates virtual broadcast domains out of an existing network. So, it's also a sort of overlay networking. However, it does so in a completely different fashion. Instead of relying on the underlying L2 segment capabilities, VXLAN assumes that all the participating nodes have L3 (i.e. IP) connectivity. On every VXLAN node, outgoing Ethernet frames are captured, then wrapped into UDP datagrams (encapsulated), and sent over an L3 network to the destination VXLAN node. On arrival, Ethernet frames are extracted from UDP packets (decapsulated) and injected into the destination's network interface. This technique is called tunneling. As a result, VXLAN nodes create a virtual L2 segment, hence an L2 broadcast domain.
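
The wrapping step can be sketched as follows. It assumes the 8-byte VXLAN header from RFC 7348 (a flags byte, a 24-bit network identifier called VNI, and reserved bits), which the text above doesn't spell out. The function names are mine, and the sketch stops short of actually sending the result in a UDP datagram.

```python
import struct

VXLAN_UDP_PORT = 4789        # destination UDP port assigned to VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "the VNI field is valid" flag


def vxlan_encapsulate(inner_frame, vni):
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit into 24 bits")
    # Two 32-bit words: flags in the top byte of the first word,
    # VNI in the top three bytes of the second; the rest is reserved (zero).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame  # this becomes the UDP payload


def vxlan_decapsulate(udp_payload):
    flags_word, vni_word = struct.unpack("!II", udp_payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8, udp_payload[8:]


inner = b"\xff" * 6 + bytes(6) + b"\x08\x00" + b"payload"  # a whole Ethernet frame
udp_payload = vxlan_encapsulate(inner, vni=5000)
vni, extracted = vxlan_decapsulate(udp_payload)
```

With 24 bits for the identifier, there is room for about 16 million virtual segments (2**24), compared to the 12-bit VLAN ID.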

Of course, nothing prevents us from putting all the VXLAN nodes in a single L3/L2 segment. So, then VXLAN would be just a way to overcome the limitation of VLAN on the number of networks per segment. However, usually, VXLAN is used over multiple interconnected L3 segments.

I'd imagine that most real-world VXLANs reside in one or a few tightly connected data centers. However, since VXLAN requires only IP to IP connectivity of the participating nodes, it essentially allows one to turn arbitrary internetwork nodes into a virtualized L2 segment. While impractical, such a virtual L2 segment can span multiple WANs or even a part of the Internet. Mind-blowing 🤯

From some perspective, VXLAN can even be seen as inverse to VLAN. VLAN splits a single L2 segment (and broadcast domain) into multiple non-intersecting segments that can then be used to set up multiple L3 segments. VXLAN, on the contrary, can combine multiple L3 segments into one [virtual] L2 segment.

Check out the lab on how to set up a simple VXLAN.

Written by Ivan Velichko
