Spanning The Tree : Dr Radia Perlman & Untangling Networks

As computer networks get bigger, it becomes increasingly hard to keep track of the flow of data over this network. How do you route data, making sure that the data is spread to all parts of the network? You use an algorithm called the spanning tree protocol — just one of the contributions to computer science of a remarkable engineer, Dr. Radia Perlman. But before she created this fundamental Internet protocol, she also worked on LOGO, the first programming language for children, creating a dialect for toddlers.

Born in 1952, Perlman was a prodigy who excelled in math and science, and in her own words, “Every time there was a new subject or a quiz I would be very excited at the opportunity to solve all sorts of puzzles”. She graduated from MIT in 1973 and got her Masters degree in 1976.

While she was working on her Master’s degree, she worked with Seymour Papert at the MIT Artificial Intelligence Lab, which was working on LOGO, the first programming language for children. In the simplest version of this language, kids could learn the fundamentals of programming by writing programs that controlled the motion of an on-screen or motorized turtle. Perlman created Toddlers Own Recursive Turtle Interpreter System (TORTIS), a simplified version of LOGO that could be used by pre-school children. This was controlled by buttons that allowed the toddler to experiment with a Logo turtle with a less intimidating interface than existing systems. “Most important of all,” the abstract for her paper describing the project says, “it should teach that learning is fun.”

Spanning Tree

After getting her masters and leaving MIT, Perlman joined BBN, a defense contractor, then moved to Digital Equipment Corporation (DEC) in 1980. At DEC, she was tasked with looking into ways to deal with the increasing complexity of the local area networks (LANs) that the company was creating. Specifically, how do you stop data from getting trapped in a loop?

Imagine a small network of two computers connected to a router. When one device sends out a data packet, the switch (a router that includes the ability to switch data between multiple ports) passes it onto the other device. But LANs are seldom that simple, and with more computers being connected to the nascent Internet, it was getting harder to figure out how to effectively and efficiently route data between multiple switches. To create networks that could survive the loss of a connection, a switch would be connected to multiple other switches. That way, if one connection failed, you had a backup. You also have multiple possible ways to route data, which creates the possibility of loops. A loop is where one switch passes a packet of data to another, which then sends it back to the first. The first switch then resends it to the second, and so on. That can create a broadcast storm, where the links between switches become filled with these echoing packets.

So how do you avoid this? Perlman’s solution was deceptively simple. She created the Spanning Tree Protocol (STP), which allows each switch in the network to figure out its place in the scheme of things. This innovative standard was codified in the IEEE standard 802.1D, last revised in 2004.

Each switch is assigned a number by the network designer, called the Bridge ID (BID) which indicates its prominence in the network. The switch with the lowest BID is designated as the root bridge. If you think of it as the root system of a tree, the root bridge is the base of the trunk, the point where the root system begins. Below this, each switch is a branch of the root system, branching off and reaching the devices at the root tips.

It’s more complex than a tree root, though: there are multiple links between root branches, and sometimes these links get cut. To negotiate this, each of the switches periodically sends out special data packets called Bridge Protocol Data Units (BPDUs) that contain the BID and another number called the path cost for each connection it has available. This number is calculated from the bandwidth that the connection offers: the lower the path cost, the higher the bandwidth the connection offers. As each switch receives the BPDU from its neighbors, it uses the path cost to decide which connection to use: the lower the cost, the more favorable the connection is. It then sends out its own BPDUs to neighbors, which use this information to calculate which connections to use. This then propagates through the network until every switch knows its place, and which routes are most efficient to use.

If a connection fails, the switch sends out a specially flagged BPDU that contained the details of the remaining connections, which the other switches use to recalculate. In effect, the network would route around the damage and just keep working.

It is such a simple solution that it now appears obvious, but Dr Perlman struggled to get her fellow engineers to accept it, because it was so simple. Many engineers struggled to accept that the answer to the problem could be that simple. “My designs were so deceptively simple that it was easy for people to assume I just had easy problems,” she told the Atlantic, “whereas others, who made super-complicated designs (that were technically unsound and never worked) and were able to talk about them in ways that nobody understood, were considered geniuses.”

The beauty of the STP is that none of the routers need to know the details of the entire network to work: all it needs to know is its own BID and the route cost of the connection to its immediate neighbors. The inverted tree of the network is automatically created by each of the switches using the information it has to decide how to route data. It’s perhaps best described by a poem that Perlman herself wrote in the patent application for the standard and the academic paper describing the protocol (PDF link):

Algorhyme

I think that I shall never see
a graph more lovely than a tree.

A tree whose crucial property
is loop-free connectivity.

A tree that must be sure to span
so packet can reach every LAN.

First, the root must be selected.
By ID, it is elected.

Least-cost paths from root are traced.
In the tree, these paths are placed.

A mesh is made by folks like me,
then bridges find a spanning tree.

Although the original STP has since been replaced by enhanced and updated versions such as the Rapid Spanning Tree Protocol (RTSP), the fundamental idea of a self-organizing, adaptive network remains one of the underpinnings of networking and the Internet. Many of these replacement protocols have been created with the input of Perlman, who continues to work on networking standards such as and TRansparent Interconnection of Lots of Links (TRILL). Although the networks of the Internet are now far more complex than those she first thought about in the 1970’s, the survival of these networks owes much to her innovative, and we would say creative, thinking.

30 thoughts on “Spanning The Tree : Dr Radia Perlman & Untangling Networks

  1. “…the switch (a router that includes the ability to switch data between multiple ports) …”
    In studying for my network+ and CCNA, it seems like switches are described as moving frames between hosts on a network, and routers move frames between networks. And some routers double as switches. But a switch does not a router make.

      1. Uhg, why would you use a crappy SG500 as an example of an L3 switch? At least reference something professional like a Cat3850, 3650, or even a 3750…

        I have to agree though that a L3 switch is not a router and a router is NOT a switch. You can add an EHWIC or a NIM to an ISR but that is not built into the router, it is a switch module connected to the midplane/backplane of the router.

        Conversely an L3 switch typically does not have nearly enough RAM to be a full featured router. They also do not perform packet inspection in any form – lack NBAR, are not stateful, have very limited ACL capability. Yes they have routing capability for IGPs and limited exterior protocols such as eBGP… But they don’t have enough RAM to carry the full RIB until you get into the big iron like 4500 SUP8E, 6800, Cat9K… and the routing on those is done by the Supervisor, or an additional DFC. Which is arguably a router in the chassis to which the linecards are attached.

        Put simply, these products serve different purposes but the overlap is significant to be sure. Please don’t confuse routers with switches/hubs/bridges. We are talking about a distinct L2 protocol here and it’s long overdue demise. STP is slow, doesn’t scale well and most people don’t truly understand it on the level they should if they’re going to use it. Far too often a find root switches assigned at random because no one changed the priority etc.

        1. Uhg, why would you use a crappy SG500 as an example of an L3 switch? At least reference something professional like a Cat3850, 3650, or even a 3750…

          Because, elitism aside (no the SG500 cost the same as a car, it isn’t a $10 cheapie either), it is an example of a device that has features of both a router, and a switch, indicating that you can in fact, have a device which offers both.

          The routing functionality is exactly that, bare bones static routing. No fancy dynamic stuff, just the basics. Does it do packet filtering? No, that’s what a firewall is for. The only two things in favour of these things is:

          1. they don’t take up any additional space
          2. since the router is embedded in the backplane of the switch, you don’t have the bottleneck that you’d have with a switch plugged into a (likely trunked) port. (This; plus the fact that our core router doesn’t support enough VLANs because Cisco deliberately hobbled it… are the reasons my workplaces uses this feature)

          I agree it goes downhill from there. The SG500 for example will do IPv4 routing, but not IPv6, which is one thing I don’t like about the one at work. If I want serious routing capabilities, I wouldn’t go looking for that in a switch… but such things do exist.

          1. I believe you both are right. In fact, it all comes down to CAPEX and OPEX. The SG family is capable of inter-VLAN routing but that is tied to performance hits as the routing is done in SW (in contrast to switching happening in the ASIC). So, depending on your requirements (both logistical and performance) you may see this as a viable solution

    1. I think a lot of the confusion is due to most, if not all, consumer devices including both functions in one box. Often with Modem and WiFi Access Point as well.

    2. A “switch” is a LAN device that broadcasts “all” packets received in the upstream/downstream port to all the other ports irrespective of weather the intended recipient device is on that particular port or not. A switch is not aware of which devices are on which ports. In the other direction any packets received on the other ports are switched to the upstream/downstream port.

      You usually have a single device on each port of a switch so that extra packets are simply discarded. This does reduce the available bandwidth “to” the end device but at the same time causes some traffic balancing between these devices. A switch can’t really be used between networks as the extra un-routed traffic will cause link saturation.

      A “router” is aware of destination locations (or devices) and routs packets to the exact port that the recipient device is on or through (to another router). So routers are used for links between networks.

      1. RÖB – your definition of ‘switch’ sounds more like my definition of ‘hub’. A hub will certainly rebroadcast everything to every port.
        A switch (even an unmanaged) switch builds and maintains a table to learn what devices (mac ids) are connected to which port. If it receives a packet whose destination is in the table, then it sends it only to that port. If it’s an unknown destination (or broadcast) packet, then it does send it everywhere, much like a hub.

        I think I’ve stopped thinking of switches and routers as discretely defined devices. As Jonathon mentioned, there are so many devices that combine functions (also home, commercial, industrial, telecom infrastructure, etc) that it’s more of a spectrum kind of definition.

      2. No, a switch does NOT broadcast all packets to all ports irrespective of whether the intended target of the packet or not. You couldn’t be more wrong about that if you tried. The whole point of a switch is that it switches destination port according to the packet destination. It’s where its name comes from.

        A switch learns (or is configured to know) which MAC addresses are associated with each port and will only transmit unicast packets destined for that MAC, multicast packets for any multicast networks that the MAC is a part of, and broadcast packets for the network that the MAC is associated with according to the ARP table. They come in ‘dumb’ varieties such as that you’ll find in BestBuy (eg NetGear FS108, GS108), that will simply learn which MACs are where; and enterprise varieties (eg Cisco 3760X series) with a huge number of features including VLAN support, ACLs, STP to name just 3 of the most oft used. A switch operates at Layer 2, although enterprise varieties often have Layer 3 abilities too (eg the ability to assign VLAN IPs directly to a port and “route” IP packets between them). Enterprise switches have a backplane with a massive bandwidth able to handle traffic from many saturated input ports simultaneously, switching each packet to the appropriate output port.

        What you described is a hub, which is a dumb Layer 2 only device. A hub does not scale well as every port sees every packet. If you have 8 hosts simultaneously sending large numbers of packets on a 100Mb/s hub then you will saturate it – each port will only be able to put an theoretical average of 12.5Mb/s onto the backplane because the ports can only output 100Mb/s and they have to transmit every packet. That leads to lots of collisions and retried packets. In practice, you’ll only get about 85-90% of that bandwith.

        A router (historically) is purely a Layer 3 device, connecting two or more IP networks together and routing packets between them according to a routing table. For small internal networks that table may consist of only static routes – for edge routers that table will likely have a dynamic element to it, such as BGP, where the router learns the best route to far off networks. (eg, a corporate router service by 2 different ISPs).

        You won’t see many pure routers any more unless you work in a major ISP. “Switch if you can, route if you must” became the mantra. Firewalls became important, operate at Layer 3, and include routing abilities that handle most office needs. It’s common to see switch stacks hanging off of firewalls, with those firewalls connected directly to the upstream connection.

      3. A “hub” broadcasts all packets received, to all other ports, regardless of whether the target device is on that particular port or not.

        A “switch” usually uses the MAC address to determine which port is the target. Some switches also use the IP address.

        A “router” uses IP addressing, to transfer data between 2 or more networks.

  2. While STP, RSTP, and PVSTP+ are staples of smaller networks, more and more you’ll find larger corporate networks as well as datacenters moving off of layer 2 protocols like those listed above. As companies move layer 3 networking like BGP and OSPF closer to the end client machines, STP and the like are becoming more a thing of the past. With each step closer to the end user/client/server layer 3 offers more routing choices, better failover, and the ability to route around maintenance windows and outages. Facebook’s code blog has some good articles on how they implemented a solution, as does AWS, to the point now where via layer 3 routing a machine/vm can be drained for maintenance at the network level instead of at layer 5-7 (load balancer).

  3. Overall a good article. Certainly Radia Perlman has contributed much to the networking community. Honestly I’d argue that her work on IS-IS enabled Internet growth much more than STP.

    However, I think the author has taken some artistic license that goes a bit too far. It must be said that labelling STP as a “fundamental Internet protocol” is more hyperbole than fact. There’s also some mixing of the terms bridge, switch and router that clouds some explanations. Did STP allow localized networks/broadcast domains to scale? Absolutely. However, that’s at best many steps removed from connecting STP to the growth of the Internet.

    Funamentally, STP functions one layer below where IP functions and isn’t fundamentally needed for IP to function. The fact that STP is an IEEE standard and not an IETF is also indicative of where STP sits in the network stack.

    One more protocol that some may find of interest is REP. It’s mostly useful for managing Ethernet based network rings (e.g. a metro ring), and is probably outside the scope of LANs, but is a related protocol that some may find of interest.

    1. If you are interested in REP (which is a closed Cisco standard, I believe), I would urge you to also look at ITU G.8032 (aka ERPS) as an alternative open standard for protection switching in Ethernet rings. There’s also ITU G.8031, which is for linear protection switching. I am not sure how widely the ITU standards are used in the real world compared to the vendor proprietary stuff. Certainly many vendors support them nowadays.

    2. In my experience as a programmer, STP is by far the part of networking most likely to be referred to as important, or used as a technical example to explain an idea.

      Your complaints are weird, it seems you’re attacking the importance of the fundamental parts by pointing out implementation details that naturally came later and got hung on different pegs.

      The existence of IP tunneling does not, in any way, reduce the importance of the physical networking protocols that give us a reason to even think about wanting IP-this-and-that.

      Reading your response I start to understand the reasons why she doesn’t like being called the “Mother of the Internet.” When I was reading her explanation I was only thinking of people like that guy who claims to have invented email!

      1. In my experience as a network architect, bridged STP is 2 to 3 orders of magnitude less important on the history of IP routed internet than any of the IP routing related work Radia did.

  4. …years ago when i was bored at my last place of employeement, I’d plug a few cat leads into various random ports of differing switches just to flood our network. No STP there! Doing this would (1) …give me something to do sometimes for up to half an hour and (2) ….the managers thought I was a genius when I “found and fixed” the problem within a few minutes.

  5. No, switches do not forward all packets to all ports.
    That would be a hub.
    A switch builds a table of ports with associated VLAN’s and forwards broadcasts only to those ports on that VLAN. In 2005 I experimented with using OSPF down to the Access layer (6500 w/Sup32) and it was nice to be able to shut down STP. However nowadays most places are selectively removing STP and regaining that 50% bandwidth loss on the redundant 2nd uplink by using MCEC, VlexLinks, etc.
    Although I’ve mostly used enterprise 37-3850, Nexus machines.

  6. Spanning Tree is not just not a “fundamental Internet protocol” – it is not even an Internet protocol. It only works on Ethernet-based LANs, so it is at maximum an Intranet protocol, and the messages are exchanged on Layer 2 too.

    1. Wait, you’re saying that ethernet is for LANs? I guess token rings are for the big interconnected networks?

      Pedanticism run amok; there is not a formal definition of “fundamental internet protocol” for you split hairs over! When you disagree with the terms, but the terms aren’t formal, you’re almost certainly making a false claim. As here.

      You can have a different opinion and be correct, but you can’t reject opinions as incorrect. You only get to have your own.

      1. The IEEE 802 group of standards, which Ethernet, the Spanning Tree Protocol and Token Ring are part of, is literally defined by “The IEEE 802 LAN/MAN Standards Committee”. You see the “LAN/MAN”? You can’t build Internets without layer 3 addressing and routers. Neither Spanning Tree nor Ethernet nor Token Ring care about that, they only do Layer 2. Also if Spanning Tree was an actual Internet protocol, it would be an IETF standard.

        (Dude, if you want to lecture people on standards and terminology, then af least make sure you know what you’re talking about. I bet you’ve never even read an IEEE standard. Also it’s FDDI you meant to say instead of Token Ring.)

  7. Spanning Tree is a pita. the tactical use of well placed, well configured routers in a network requiring spanning tree is a much better way of doing things with less down time should something fail.

  8. Just a minor detail: the text for the TORTIS link should be: “Toddler’s Own Recursive Turtle Interpreter System”
    (Although I admit that “Turtles Own Recursive Turtle Interpreter System” sounds way more recursive… ;-) )

    Best regards and thanks for a great site.
    A/P Daniel F. Larrosa

  9. It’s time for a new way to connect computers.
    The interconnect needs in 2018 are vastly different from what they were in 1990. The limitations of the star serial “push” approach (now known generally as switched networks), with control planes that constrain the flow of data rather than promote it, are increasingly evident, and are now constricting the path forward.
    The solution requires a complete rethinking of the data transfer problem, resulting in a new and better interconnect, based on a completely different set of concepts. More on this topic is discussed in the following whitepaper: https://lnkd.in/gsKcrFv

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.