Working On Open-Source High-Speed Ethernet Switch

Various hardware components laid out on a workbench.

Our hacker [Andrew Zonenberg] reports in on his open-source high-speed Ethernet switch. He hasn’t finished yet, but progress has been made.

If you were wondering what might be involved in a high-speed Ethernet switch implementation, look no further. He’s been working on this project, on and off, since 2012. His design now includes a dizzying array of parts. [Andrew] managed to snag some XCKU5P FPGAs for cheap, paying two cents on the dollar, and having access to this fairly high-powered hardware affected the project’s direction.

You might be familiar with [Andrew Zonenberg] as we have heard from him before. He’s the guy who gave us the glscopeclient, which is now ngscopeclient.

As perhaps you know, when he says in his report that he is an “experienced RTL engineer”, he is talking about Register-Transfer Level, which is an abstraction layer used by hardware description languages, such as Verilog and VHDL, which are used to program FPGAs. When he says “RTL” he’s not talking about Resistor-Transistor Logic (an ancient method of developing digital hardware) or the equally ancient line of Realtek Ethernet controllers such as the RTL8139.

When it comes to open-source software you can usually get a copy at no cost. With open-source hardware, on the other hand, you might find yourself needing to fork out for some very expensive bits of kit. High speed is still expensive! And… proprietary, for now. If you’re looking to implement Ethernet hardware today, you will have to stick with something slower. Otherwise, stay tuned, and watch this space.

15 thoughts on “Working On Open-Source High-Speed Ethernet Switch”

    1. Yep it’s a similar idea targeting the much lower end for in-chassis networking.

      My design is aimed at being competitive with e.g. a Cisco C9300L-48T-4G (1U, non PoE, 48x 1G edge ports) except I’m putting in dual 25G rather than quad 10G uplinks and omitting a lot of the firmware bells and whistles I don’t need (802.1x, IGMP, etc).

      I just want port VLANs, 802.1q, probably 802.3ad on the uplinks eventually, maybe some basic ACLs eventually, ability to force speed/duplex, TDR cable testing, performance counters, probably span analysis eventually, and SSH management – on a dedicated RGMII port that is (by design) not bridged to the fabric so you can have completely isolated management if you want.

      1. Sorry, I should have clarified that I meant on the RTL side. I’m still using the Xilinx tools for synthesis and P&R, although it’s the free edition (this is the largest FPGA supported by said free edition).

        I’m 100% open on the RTL other than using the Xilinx logic analyzer IP for debug (which won’t be in the final shipped firmware obviously). Most notably I am not using the transceivers wizard; I’ve RE’d enough of the undocumented black box parameters that I can run 10Gbase-R and QSGMII using raw GTYE4_CHANNEL and GTYE4_COMMON primitive instantiations and some simple wrappers around them.

        It’s been a while since I’ve played with the open tools. I love the idea, but none of them are both mature enough and have good enough support for large/high end parts to be useful to me yet. Give it a few years maybe.

        1. “Give it a few years maybe.”

          I mean, not likely, not for stuff like this. The vendor controls the hardware, and when you’re working literally at the bare-metal level it’s just not really worth it to try to RE it. Especially because for some of this stuff Xilinx just patches hardware bugs in RTL and never bothers documenting stuff because they don’t have to support anything other than their own hardware.

          1. I’m aware of some of this – the GTY CPLL has a silicon errata where you have to measure the output frequency, see if it’s in range, and if not then poke some undocumented registers semi-randomly, reset it, and try again until it works.

            I’m not using the transceivers wizard and haven’t implemented the workaround in my wrapper yet so for now I’m just using the QPLL. This design can be completely achieved with the QPLL only so I’m in no rush.
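The retry procedure described in this thread (measure the output frequency, check whether it is in range, nudge some settings, reset, try again) can be sketched as a plain loop. This is only an illustration in Python, not [Andrew]’s wrapper and not the vendor’s workaround: the three callables, the tolerance, and the retry limit are all hypothetical placeholders, since the registers involved are undocumented.

```python
from typing import Callable


def lock_with_retries(
    measure_hz: Callable[[], float],   # hypothetical: returns the measured PLL output frequency
    perturb: Callable[[int], None],    # hypothetical: stands in for the undocumented register writes
    reset: Callable[[], None],         # hypothetical: stands in for resetting the PLL
    target_hz: float,
    tolerance_ppm: float = 100.0,      # assumed tolerance, not taken from any datasheet
    max_retries: int = 16,             # assumed limit so the loop cannot spin forever
) -> int:
    """Retry until the measured frequency is within tolerance of the target.

    Returns the number of retries that were needed (0 means the first
    measurement was already in range). Raises RuntimeError if the output
    is still out of range after max_retries retries.
    """
    limit_hz = target_hz * tolerance_ppm / 1e6
    for retries in range(max_retries + 1):
        if abs(measure_hz() - target_hz) <= limit_hz:
            return retries
        if retries < max_retries:
            perturb(retries)  # change something, then start over
            reset()
    raise RuntimeError("PLL output never came into range")


if __name__ == "__main__":
    # Simulated hardware with made-up readings: two bad measurements, then a good one.
    readings = iter([4.9e9, 5.3e9, 5.0e9])
    needed = lock_with_retries(
        measure_hz=lambda: next(readings),
        perturb=lambda attempt: None,
        reset=lambda: None,
        target_hz=5.0e9,
    )
    print(f"in range after {needed} retries")
```

Passing the hardware accesses in as callables keeps the loop testable without a board, which matters for a workaround whose real register writes can only be found by experiment.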

  1. While I fully get the FOSS sentiment, using proprietary tools and $4k chips makes it not really something worthwhile ‘for the rest of us’.

    I don’t know the purchasability of Realtek chips, and they are hostile-ish with NDAs (but plenty of leaks), but the rtl8k and rtl9k series do pretty much this, at a fraction of the cost. Also, work is happening on OpenWrt for them (I have paused my work on them for the time being, though).

    The chip is closed source of course, but all software is open. I doubt this is very different with FPGAs, as there are also proprietary IP blocks there …

    But I do much appreciate the effort. We need more dedicated hackers :)

    As for scopes, the mso5k had its source leaked so lots of hacking possible there!!

    1. The alternative is to use switch ASICs where you have to sign an NDA and have a volume sales agreement to even get a datasheet. I started this project because going FPGA-based was the only way to have a properly OSH friendly switch, at least until open ASIC PDKs reach the point that gigabit SERDES IPs and large dies and FCBGA packages are viable. I don’t want to use NDA’d parts even if the stuff has leaked because I don’t want to give any money to people encouraging such business practices.

      And while the chip is close to $4K at Digikey, they can be had much cheaper if you don’t mind a sketchy overseas source that’s probably selling you a reball. LCSC currently has them for around $100 in qty 1 down to $85 in volume. At that point for a DIY/low volume build, the ten-layer PCB is going to be the dominant cost (probably high 3 digits USD for a batch even at my usual Shenzhen fab, close to $10K in the USA).

  2. Interesting project, but the last two switches I bought for my home networking closet were around $29 on Goodwill. 48 port gigabit POE rack mount switches. I think they were Netgear, but I also have a couple of HP ProCurves. When the big corps go to 10G on the workfloor, they recycle this perfectly good high quality gear. And I feel better that I have given it a second life as opposed to a trip to the recycler.

    1. I’ve been using $50 ebay’d Ciscos for a long time, but they don’t have 10/25G uplinks and I don’t exactly trust them as much as I’d like to given they weren’t purchased new. I had been planning to replace them “any year now” with my own designs since about 2015 (when they were already starting to get old) but I ended up taking longer to get to this point than I had hoped.

      The end goal of this project is mostly for fun and learning, but also to have a switch that I can trust as completely as possible without fabbing my own fabric ASIC, decapping a few at random, and diffing SEM images against the GDS.

      1. Also I forgot to mention, power is the other huge consideration.

        My current Ciscos pull about 150W each for 24 gigabit ports. Multiply by four and that’s a fair bit of juice.

        My FPGA design is targeting sub 50W for 48 ports, a 6x reduction in power. Cutting ~500W off my power bill will save me something like $70 a month in lab operating costs and substantially prolong my run time on UPS/generator during power outages, to say nothing of the environmental benefits.

      2. Designing for learning is a very valid reason. Since I don’t need the 10G uplink capability, I’m happy with my surplus switches. I have found they’re more resistant to induced voltage on the inputs from nearby lightning strikes than the consumer grade ones I’ve had. I live on top of a hill and we occasionally get strikes that are close enough to fry my garage door openers. The commercial grade switches seem to actually have protection on the individual ports, as opposed to the consumer grade switches which blow ports when you look at them sideways!
        When you mentioned the 150W for the PoE ports…surely that is only when they’re actually pulling PoE current? I should measure the line current my HPs pull. The only PoE devices I have currently are some SIP phones I have been playing with.

        1. No. These switches are not PoE capable (I have a separate injector for my APs and cameras). They pull ~150W actual measured load on the UPS just sitting there pushing packets under real world load levels.

          As compared to my design which is aiming at circa 50W TDP rather than typical usage (all ports lit up, most things at heavy load, etc).

          My line cards should be fairly robust, they have ESD224s (12 kV contact / 15 kV air gap discharge rating) on all of the diff pairs plus whatever protection is in the PHY itself.
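The power figures quoted in this thread can be checked with a few lines of arithmetic. The sketch below uses only the numbers from the comments (about 150W measured per 24-port switch, four switches, a sub-50W target per 48-port design, roughly $70 a month saved). It assumes the same number of ports is replaced one for one and that a month averages 730 hours; the electricity price is not stated in the thread, so it is derived from the quoted saving rather than assumed. Note that it compares a measured load against a design target, so the result is an estimate.

```python
# Figures quoted in the comment thread.
OLD_WATTS_PER_SWITCH = 150.0
OLD_PORTS_PER_SWITCH = 24
OLD_SWITCH_COUNT = 4
NEW_WATTS_PER_SWITCH = 50.0       # design target, not a measurement
NEW_PORTS_PER_SWITCH = 48
QUOTED_MONTHLY_SAVING_USD = 70.0

# Assumption (not from the thread): an average month has 8760 / 12 = 730 hours.
HOURS_PER_MONTH = 8760 / 12


def power_summary() -> dict:
    """Recompute the power comparison from the quoted figures."""
    old_total_w = OLD_WATTS_PER_SWITCH * OLD_SWITCH_COUNT
    total_ports = OLD_PORTS_PER_SWITCH * OLD_SWITCH_COUNT
    # Assumption: the new switches replace the same number of ports.
    new_switch_count = total_ports / NEW_PORTS_PER_SWITCH
    new_total_w = NEW_WATTS_PER_SWITCH * new_switch_count
    saved_w = old_total_w - new_total_w
    kwh_per_month = saved_w / 1000 * HOURS_PER_MONTH
    old_w_per_port = OLD_WATTS_PER_SWITCH / OLD_PORTS_PER_SWITCH
    new_w_per_port = NEW_WATTS_PER_SWITCH / NEW_PORTS_PER_SWITCH
    return {
        "old_total_w": old_total_w,
        "new_total_w": new_total_w,
        "saved_w": saved_w,
        "per_port_ratio": old_w_per_port / new_w_per_port,
        "kwh_per_month": kwh_per_month,
        # Price per kWh implied by the quoted saving, under the assumptions above.
        "implied_usd_per_kwh": QUOTED_MONTHLY_SAVING_USD / kwh_per_month,
    }


if __name__ == "__main__":
    for name, value in power_summary().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the totals come to 600W today against 100W for the FPGA design, which matches the quoted ~500W saving and the 6x per-port figure, and the $70 a month implies a price of roughly $0.19 per kWh.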
