Manta: An Open On-FPGA Debug Interface


We can always use more tools for FPGA debugging, and the Manta project by [Fischer Moseley] delivers without a shadow of a doubt. Manta lets you add a debug and data transfer channel between your computer and your FPGA, which you can easily access with the included Python libraries.

With just a short configuration file as input, it gives you cores you add into your FPGA design, tapping the signals of interest as an FPGA-embedded logic analyzer, interacting with registers, and even letting you quickly transfer tons of data if you so desire.

Manta is easy to install, is developer-friendly, has been designed in Amaranth, and is fully open source as you would expect. At the moment, Manta supports both UART and Ethernet interfaces for data transfer. As for embedding the Manta cores into your project, they can be exported to both Amaranth and Verilog. You should check out the documentation website: it contains everything you might want to know to get started quickly.

The Manta project started out as our hacker’s MIT thesis, and we’re happy that we can cover it for you all. FPGA-embedded logic analyzers are a fascinating and much-needed tool, and we’ve had our own [Al Williams] tell you about his on-FPGA logic analysis journey!

30 thoughts on “Manta: An Open On-FPGA Debug Interface”

  1. Logic analyzer projects to poke around inside FPGAs are fairly common (the Manta project names a few others), but from what I have seen they tend to have a very sparse frontend.

    I find it quite strange that people who make these (as open source versions) have not adapted Sigrok / Pulseview to visualize the data.

    Sigrok / Pulseview has received very little development in the last few years and that is quite sad. I don’t really have the programming skills to improve on this, but my hope still is that some day Sigrok is going to be supported with some native hardware, and can generate some revenue from that.

    1. SDRsharp, which runs on Windows, was the (free) go-to application in the SDR (Software Defined Radio) world. Then they produced and started to sell their own SDR hardware (AirSpy R2, Airspy mini, Spyverter, Airspy Discovery HF + and Youloop antennas). So what you suggest can happen (the source code for SDRsharp was available for inspection only at one stage in its history).

    2. Good observation, you’re right that most logic analyzers don’t include their own frontend!

      At least from what I can tell, this is because the community has separated the problem of displaying waveforms (with tools like Surfer, GTKWave, PulseView, and others) from the problem of generating them. Logic analyzers (like Manta and the others I’ve listed on the ‘Alternatives’ page of the docs) aren’t the only programs that generate waveforms – simulation tools (like Icarus Verilog, Verilator, Questa, and others) also produce them. In order to not need to bundle a waveform viewer with every application, folks just decided to split this into two separate problems, and call it a day.

      One nice consequence of this is that you can actually open the VCD files that Manta generates inside PulseView! Although last time I tried this I think PulseView crashed, but I might have compiled it incorrectly.
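For readers who have not met the format, a VCD (Value Change Dump) file is plain text defined in the Verilog standard, which is why any tool can produce one and any viewer can open it. The following is a minimal sketch of a writer for single-bit signals. It is not Manta's exporter: the function name `write_vcd`, the scope name `capture`, and the signal names in the usage line are all made up for illustration.

```python
import io


def write_vcd(samples, signal_names, timescale="10 ns", out=None):
    """Write single-bit samples as a minimal Value Change Dump (VCD).

    samples: list of tuples, one tuple per clock, one 0/1 value per signal.
    signal_names: names matching the tuple positions.
    Hypothetical helper for illustration only.
    """
    if len(signal_names) > 94:
        raise ValueError("this sketch only uses single-character identifiers")
    out = out if out is not None else io.StringIO()
    # VCD identifier codes are printable ASCII characters, starting at '!'.
    ids = [chr(33 + i) for i in range(len(signal_names))]

    # Header: timescale, one scope, and one 1-bit wire per signal.
    out.write(f"$timescale {timescale} $end\n")
    out.write("$scope module capture $end\n")
    for ident, name in zip(ids, signal_names):
        out.write(f"$var wire 1 {ident} {name} $end\n")
    out.write("$upscope $end\n")
    out.write("$enddefinitions $end\n")

    # Body: emit a timestamp only when at least one signal changed.
    previous = None
    for t, row in enumerate(samples):
        changes = [
            f"{value}{ident}"
            for i, (value, ident) in enumerate(zip(row, ids))
            if previous is None or previous[i] != value
        ]
        if changes:
            out.write(f"#{t}\n")
            out.write("\n".join(changes) + "\n")
        previous = row
    # Closing timestamp so viewers give the last sample a visible width.
    out.write(f"#{len(samples)}\n")
    return out


vcd_text = write_vcd([(0, 1), (1, 1), (1, 0)], ["clk_en", "valid"]).getvalue()
```

Writing `vcd_text` to a file with a `.vcd` extension should give something GTKWave or Surfer can load. Multi-bit buses need the `b<binary> <id>` value syntax, which this sketch leaves out.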

  2. I just started reading Fischer Moseley’s well-written Master’s thesis. In Chapter 1, he lays out his motivation for developing an open-source FPGA environment, and I can tell you, his was mostly a practical play, not an idealistic one. Xilinx would do well to pay attention here, because corporate inattention to their Vivado EDA tools clearly drove the MIT EECS undergraduates and teaching staff nuts, resulting in the company squandering goodwill at a world-class university. In summary, the author’s herculean efforts were in response to Xilinx not realizing that macOS and Apple Silicon were super popular with engineering students, and allowing the company’s EDA tools to become “bloat-ware.” According to the author, Xilinx’s proprietary EDA tools consume 100GB of disk space. Since laptop SSDs are expensive, some students bought 128GB models, and there the Xilinx EDA tools consumed 78% of available disk space for no good reason! To be fair, I’ve seen other companies make the same mistake. In an era when Windows was king, engineering companies happily bought PCs for their hardware developers, but the world is changing, and this approach won’t work for future engineers or in a world where ARM is making Intel squirm. Would you abandon your MacBook Air just to get through one course?

    1. Most engineering tools need real computers with real performance. The world isn’t changing: the Mac isn’t for hw development. You have to choose the right tools.

      1. Xilinx actually mostly forces you onto Linux, not Windows. Their entire Petalinux toolchain requires Linux and they create absurdly deep directory structures that break NTFS.

        Of course they actually force you onto a specific distribution and basically don’t even let you upgrade, so the whole thing’s a major disaster and you need a VM and a main PC with 64+ GB of RAM anyway for the major devices.

        The disk usage is absurd though, especially because it’s common to have multiple entire versions installed. I have an entire 2 TB disk dedicated to Xilinx software installs.

          1. You and I have different definitions of “work.” You can’t even properly do the development under Linux! It needs to be in a VM unless you like 5 year old security issues, because their BSPs stopped working in 2019 and more recent distros changed behavior and the Yocto build is so old they refuse to support it.

            And the response to “so what’s bad about a VM” is these things can use 8-12 GB of RAM *per instance* and there can be as many as 3 instances running.

            And I haven’t even mentioned the fact that their tools completely break Git tools. They *hard code paths* in Vitis so the integrated Git tools (because Vitis is Eclipse) just break.

    2. 100GB? Ha! That doesn’t cut it for the 2023 version. I have a clean install here and it is 178GB without any extra BSPs installed. It’s insanely bloated, and you pretty much do not get the option to do any partial installs.

      I don’t even DO FPGA development, I do software development, but for microblaze code you need Vitis, and that requires that you also install Vivado, and that requires a bunch of other things. So, I needed a bigger laptop as my older 500GB model wasn’t cutting it anymore.

      1. The large size comes from the Versal/Alveo support: if you remove those it drops quite a bit (but still large). Even more if you drop US/US+ but that’s more likely useful.

        It’s still nuts though. It’s not exactly “bloated” – the model files for larger devices are just that big – but the fact that it can’t just do on-demand device loading – and share between versions! – is nuts.

        1. About 60-70 GB. They don’t let you pare it down by individual devices, which is dumb, it’s by family. It’s still psychotically big, and again that’s not even talking about the Petalinux tools.

          Not joking that I have over 1 TB dedicated to Xilinx tools. 3 versions, plus 2 VMs for 2 separate Petalinux versions.

  3. Nice to see Amaranth getting more cool tools like this!
    But it’s not clear to me why the author didn’t reuse the etherbone / wishbone bridge protocol that’s been around for a long time, and already has good library and tool support.

    1. Agreed! Manta’s original codebase was actually in Verilog, but I ended up rewriting in Amaranth which was absolutely the right choice. Being able to describe the code running on the FPGA in the same classes as the code running on the host has been incredible for this project.

      I shied away from Etherbone/Wishbone because Manta really doesn’t need it! Manta’s model for transferring data between the host machine and the FPGA lacks flow control or arbitration, which means that Wishbone ends up being overkill for this project. As a result, Manta implements its own super simple internal bus, which has some nice implications for resource utilization and timing. More details in the thesis if you’re curious!

    1. Unfortunately accessing the FPGA’s JTAG from within the fabric is super difficult to do, and it’s even harder to do that in a device-agnostic way. For Xilinx devices at least it requires using some of their IP, which runs slightly antithetical to the original goals of the project.

      Manta has been able to handle some amount of device-dependent behavior in its Ethernet interface, but that’s because it’s been able to lean on the LiteEth project which handles that itself. As far as I’m aware, no similar project exists for JTAG interfaces. If you know of one, please let me know!

      1. I don’t entirely understand the “using some of their IP” comment, as the only thing it requires is instantiating a BSCAN object, and that’s not IP, that’s just a component?

        Unless the idea is “I don’t want to instantiate any vendor objects” – in which case that’s still not an IP issue, it’s a failing of the HDL because it has no way to infer JTAG primitives (*all* HDL is fundamentally inferring FPGA primitives from HDL).

        It’s true the JTAG access methods are kinda-sorta device specific, but, I mean, that’s what wrappers are for.

        Of course if you really want to maintain a “device agnostic” ‘barrier’ you just expose a byte-wide transmit/receive command interface (which you already have via the UART) and let the user read and supply the bytes however they want. Doing that over JTAG is fairly trivial, a 9-bit user register (one bit for valid) which reads data on capture-DR and writes it on update-DR is easy.
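The 9-bit user register described above can be sketched as a small behavioural model. This is only an illustration of the idea in the comment, under assumptions of my own: bit 8 is the valid flag, bits 7..0 carry the byte, and data shifts LSB first. The class and function names are hypothetical, and the model ignores the TAP state machine and any vendor BSCAN primitive that real hardware would need.

```python
from collections import deque


class JtagByteRegister:
    """Behavioural model of a 9-bit JTAG user data register.

    Assumed layout: bit 8 is a 'valid' flag, bits 7..0 carry one byte.
    """

    WIDTH = 9

    def __init__(self):
        self.to_host = deque()    # bytes the fabric wants to send
        self.from_host = deque()  # bytes received from the host
        self.shift_reg = 0

    def capture_dr(self):
        # Capture-DR: load the next outgoing byte with valid set, else zeros.
        if self.to_host:
            self.shift_reg = 0x100 | self.to_host.popleft()
        else:
            self.shift_reg = 0

    def shift_dr(self, tdi):
        # Shift-DR: the LSB leaves on TDO while TDI enters at the MSB.
        tdo = self.shift_reg & 1
        self.shift_reg = (self.shift_reg >> 1) | ((tdi & 1) << (self.WIDTH - 1))
        return tdo

    def update_dr(self):
        # Update-DR: accept the shifted-in word only if its valid bit is set.
        if self.shift_reg & 0x100:
            self.from_host.append(self.shift_reg & 0xFF)


def host_exchange(reg, tx_byte=None):
    """One DR scan seen from the host: optionally send a byte,
    return the byte received from the fabric, or None if none was valid."""
    word_out = (0x100 | tx_byte) if tx_byte is not None else 0
    reg.capture_dr()
    word_in = 0
    for bit in range(reg.WIDTH):
        tdo = reg.shift_dr((word_out >> bit) & 1)
        word_in |= tdo << bit
    reg.update_dr()
    return (word_in & 0xFF) if word_in & 0x100 else None


reg = JtagByteRegister()
reg.to_host.append(0x5A)            # fabric queues a byte for the host
received = host_exchange(reg, 0xA7)  # host sends 0xA7 in the same scan
```

Each scan moves at most one byte in each direction, and the valid bit lets either side send nothing. That makes the register behave like the byte-wide transmit/receive interface a UART already provides.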

  4. Surely the real question is: do we yet have a reliable, well documented, easy to use open-source Linux-compatible tool compatible with multiple brands of FPGA chip which can be used to write “programs” for FPGAs and flash them to the chips? (even if the “programs” it produces can’t make full use of all the fancier extra features manufacturers add within the hardware). Thanks

    1. Yeah, F4PGA is basically the leading contender there. It’s doable, albeit not great (although my standard of ‘great’ is pretty high given my typical use case).

        1. F4PGA is a set of open-source toolchains for building source code into a bitstream that can then be flashed to the FPGA – but F4PGA itself doesn’t actually program that bitstream to the chip. There’s a suite of other tools that do that – openFPGALoader handles most chips from most vendors, but there’s also xc3sprog for Xilinx devices and iceprog for Lattice parts if you feel like using a vendor-specific open-source tool.

          1. “but F4PGA itself doesn’t actually program that bitstream to the chip”

            You don’t actually need a tool for any of the 7 series guys, though. Just program the flash chip directly, there are buckets of tools for that.

  5. Looks like it’s only suitable for capturing very short spans, since it stores samples in block RAM. A more robust solution is to push sampled data into a FIFO and have an AXI master on the other end to send data in bursts to a buffer in DDR.

    1. That assumes that you have DDR onboard your device to store in – and that you weren’t already using it for your own code! And that you’d be okay with instantiating a memory controller for your debugger and the configuration effort and resource utilization that would entail.

      These conditions weren’t met in the original use case that Manta was created for. If you’re mentioning an AXI master then I think you might be coming from the Xilinx ecosystem – for whatever it’s worth, the Xilinx ILA stores its samples in Block RAM, not DDR. This seems to be true for most of the logic analyzers that I’ve stumbled across, but do let me know if you know of one that’s different!

      1. True, but the same goes for “how many block rams you’re willing to sacrifice”. If you’re debugging a design destined for a resource-limited FPGA, it often makes sense to do it on a larger FPGA of the same family, and this is exactly where you can put all your harnessing – DDR interfaces and controllers included.

        And yes, this is why I do not use ILA – I always need longer traces than what it can humanely store, and most of my work is on SoCs (like Zynq UltraScale+ and Cyclone V) anyway, so I have a hard CPU and a fast DDR access for free. I wrote my own logic analyser core for this purpose, it’s fairly trivial, and I know a few others who routinely do the same. Did not bother looking for existing open source ones, but if there’s a demand, I can publish mine.
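To put rough numbers on the block RAM versus DDR trade-off discussed in this thread, here is a back-of-the-envelope sketch. Every figure in it is an assumption chosen for illustration: a 36 Kib block RAM, 32 probed bits per sample, a 100 MHz sample clock, and a 256 MiB DDR buffer. It also ignores any overhead from triggers, FIFOs, or memory controllers.

```python
def samples_that_fit(memory_bits, probe_width_bits):
    """Number of whole samples a buffer of memory_bits can hold."""
    return memory_bits // probe_width_bits


def capture_span_seconds(sample_count, sample_rate_hz):
    """Time covered by a capture when one sample is stored per clock."""
    return sample_count / sample_rate_hz


# Illustrative assumptions, not measurements of any real design.
PROBE_WIDTH = 32                    # bits captured per sample
SAMPLE_RATE = 100_000_000           # 100 MHz sample clock
ONE_BRAM = 36 * 1024                # one 36 Kib block RAM, in bits
DDR_BUFFER = 256 * 1024 * 1024 * 8  # a 256 MiB DDR buffer, in bits

for label, bits in [
    ("1 block RAM", ONE_BRAM),
    ("50 block RAMs", 50 * ONE_BRAM),
    ("256 MiB DDR", DDR_BUFFER),
]:
    n = samples_that_fit(bits, PROBE_WIDTH)
    span_us = capture_span_seconds(n, SAMPLE_RATE) * 1e6
    print(f"{label}: {n} samples, {span_us:.2f} us")
```

Under these assumptions a single block RAM holds 1152 samples, or about 11.5 microseconds of capture, while the DDR buffer holds roughly 67 million samples, or about two thirds of a second. That gap is the reason both positions in the thread are reasonable, depending on what hardware the design can spare.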
