How To Build Jenny’s Budget Mixing Desk

Jenny did an Ask Hackaday article earlier this month, all about the quest for a cheap computer-based audio mixer. The first attempt didn’t go so well, with a problem that many of us are familiar with: Linux applications really don’t like using multiple audio devices at the same time. Jenny ran into this issue, and didn’t come across a way to merge the soundcards in a single application.

I’ve fought this problem for a while, probably 10 years now. My first collision with this was an attempt to record a piano with three mics, using a couple different USB pre-amps. And of course, just like Jenny, I was quickly frustrated by the problem that my recording software would only see one interface at a time. The easy solution is to buy an interface with more channels. The Tascam US-4x4HR is a great four-channel input/output audio interface, and the Behringer U-PHORIA line goes all the way up to eight mic pre-amps, expandable to 16 with a second converter that can send audio over ADAT. But those are semi-pro interfaces, with price tags to match.

But what about Jenny’s idea, of cobbling multiple super cheap interfaces together? Well yes, that’s possible too. I’ll show you how, but first, let’s talk about how we’re going to control this software mixer monster. Yes, you can just use a mouse or keyboard, but the challenge was to build a mixing desk, and to me, that means physical faders and mute buttons. Now, there are pre-built solutions, the Behringer X-Touch being a popular one. But again, we’re way above the price-point Jenny set for this problem. So, let’s do what we do best here at Hackaday, and build our own.

The Physical Goods

What we need is a microcontroller that has native USB client support, multiple digital I/O pins, and some analog inputs. I went with the Arduino MKRZero for the small size, decent price, and the fact that it’s actually in stock at Mouser. The other items we’ll need are some faders and buttons. I went for the full-sized 100 mm faders, and some LED toggle buttons made by Adafruit. The incidentals, like wire and resistors, were sourced from the local parts bin in the corner.

My first thought was to design and 3D print the panel, but after doing the layout on a scrap piece of plywood, the resulting size proved a bit too large for my printer. So we’re going retro, and making a “wood-grain” mixing desk. This would be a great project for a CNC router, but as I’m not part of that particular cool club yet, it was a drill press, table saw, and oscillating tool to the rescue. The results aren’t quite as pretty as I wanted, but maybe we’ll get a Mark II of this project one day.

The potentiometer here should be 10K.

The wiring is relatively straightforward, with a current limiting resistor to protect the LEDs inside the buttons, and a pullup resistor to prevent the digital pin from floating when the button isn’t pushed. Now, that pullup might not be necessary, as I later learned that the Arduino has built-in pullup on its digital pins. And also of note, a 10 Ω resistor is *not* a good choice for a pullup. As Al eloquently put it, that’s a “pull way up resistor”. 10 kΩ is the better choice.

And to finish the build, we’ll need a sketch to run on the Arduino. Thankfully, there’s already a great library for exactly what we want to do: Control Surface. There’s a bunch of ways to set this up, but my sketch is pretty trivial:

#include <Control_Surface.h>

// Present the board to the computer as a USB MIDI device
USBMIDI_Interface midi;

// Mute buttons, read as latching switches on digital pins 11 down to 6.
// Each one sends Control Change messages when its state changes.
CCButtonLatching button1 {11, {MIDI_CC::General_Purpose_Controller_1, CHANNEL_1}};
CCButtonLatching button2 {10, {MIDI_CC::General_Purpose_Controller_2, CHANNEL_1}};
CCButtonLatching button3 {9, {MIDI_CC::General_Purpose_Controller_3, CHANNEL_1}};
CCButtonLatching button4 {8, {MIDI_CC::General_Purpose_Controller_4, CHANNEL_1}};
CCButtonLatching button5 {7, {MIDI_CC::General_Purpose_Controller_5, CHANNEL_1}};
CCButtonLatching button6 {6, {MIDI_CC::General_Purpose_Controller_6, CHANNEL_1}};

// Faders on analog inputs A0 to A5, each sent as a Control Change value
CCPotentiometer volumePotentiometers[] {
  {A0, {MIDI_CC::Sound_Controller_1, CHANNEL_1} },
  {A1, {MIDI_CC::Sound_Controller_2, CHANNEL_1} },
  {A2, {MIDI_CC::Sound_Controller_3, CHANNEL_1} },
  {A3, {MIDI_CC::Sound_Controller_4, CHANNEL_1} },
  {A4, {MIDI_CC::Sound_Controller_5, CHANNEL_1} },
  {A5, {MIDI_CC::Sound_Controller_6, CHANNEL_1} },
};

void setup() {
    Control_Surface.begin();  // set up the MIDI interface and all the controls
}

void loop() {
    Control_Surface.loop();   // read the inputs and send out any changes
}
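
The library hides what actually goes down the USB cable, so here's a simplified look at it in plain desktop C++. Each fader move or button change ends up as a three-byte MIDI Control Change message. This is not the library's code: the function names are made up, the 10-bit ADC range is an assumption, and any smoothing of the analog readings is left out.

```cpp
#include <array>
#include <cstdint>

// Scale a raw ADC reading to MIDI's 7-bit range (0..127).
// Assumption: a 10-bit reading (0..1023); dropping the low 3 bits does the scaling.
constexpr std::uint8_t adcToCcValue(std::uint16_t raw10bit) {
    return static_cast<std::uint8_t>((raw10bit > 1023 ? 1023 : raw10bit) >> 3);
}

// Build a Control Change message: status byte 0xB0 plus channel, then controller and value.
// `channel` is 1..16 as shown to users; on the wire it is 0..15.
constexpr std::array<std::uint8_t, 3> makeControlChange(std::uint8_t channel,
                                                        std::uint8_t controller,
                                                        std::uint8_t value) {
    return {static_cast<std::uint8_t>(0xB0 | ((channel - 1) & 0x0F)),
            static_cast<std::uint8_t>(controller & 0x7F),
            static_cast<std::uint8_t>(value & 0x7F)};
}
```

For example, `makeControlChange(1, 70, adcToCcValue(1023))` yields the bytes 0xB0, 70, 127: the first fader (Sound Controller 1 is CC 70) pushed all the way up on channel 1.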

Pipewire to the Rescue

And now on to the meat and potatoes of this project. How do we convince an application to see inputs from multiple devices, and actually do some mixing? The problem here is de-sync. Each device runs on a different clock source, and so the bitstream from each may wander and go out of sync. That’s a serious enough problem that older sound solutions didn’t implement much in the way of card combining. Not long ago, the process of resampling those audio streams to get them to properly sync would have been a very CPU intensive procedure. But these days we all have multi-core behemoths, practical super-computers compared to 20 years ago.

So when Wim Taymans wrote Pipewire, he took a different approach. We have enough cycles to resample, so Pipewire will transparently do so when needed. Pipewire sees all your audio interfaces at once, and implements both the Jack and Pulseaudio APIs. Different distros handle this a bit differently, but generally you need the Pipewire packages, as well as the pipewire-jack and pipewire-pulseaudio packages to get that working.

And here’s the secret: The Jack routing tools work with Pipewire. The big three options are qjackctl, carla, and qpwgraph, though note that qpwgraph is actually Pipewire native. So even if an application can only select a single device at a time, if that app uses the Jack, Pulseaudio, or Pipewire API, you can use one of those routing control programs to arbitrarily connect inputs and outputs.

So let’s start with the simplest solution: jack_mixer. Launch the application, and then using your preferred routing controller, take the MIDI output from our Arduino control surface, and connect it to jack_mixer’s MIDI input. In jack_mixer, add a new input channel, and give it an appropriate name. Let’s call it “tape deck”, since I have a USB tape deck I’m testing this with. Now the controller magic kicks in: hit the “learn” button for the volume control, and wiggle the first fader on that controller. Then follow with the mute button, and save the new channel. We’ll want to add an output channel, too. Feel free to assign one of your faders to this one as well.

And finally, back to the routing program, and connect your tape deck’s output to the jack_mixer input, and route jack_mixer’s output to your speakers. Play a tape, and enjoy the full control you have over volume and muting! Want to add a YouTube video to the mix? Start the video playing, and just use the routing controller to disconnect it from your speakers, and feed it into a second channel on jack_mixer. Repeat with each of those five cheap and nasty sound cards. Profit!
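
Conceptually, what the mixer does with those faders and mute buttons is simple: scale each channel by its gain, skip the muted ones, and add up the rest. The sketch below shows that idea in plain C++. It is not jack_mixer's code, the names are made up, and the straight linear mapping from a CC value to gain is an assumption made to keep it short (real mixers generally use a dB-style fader curve).

```cpp
#include <cstddef>
#include <vector>

struct Channel {
    float gain;   // linear gain, 0.0 = silent, 1.0 = unity
    bool muted;   // true drops the channel from the mix
};

// Hypothetical mapping: treat a 7-bit CC value as a linear 0..1 gain.
inline float ccToGain(int ccValue) {
    if (ccValue < 0) ccValue = 0;
    if (ccValue > 127) ccValue = 127;
    return static_cast<float>(ccValue) / 127.0f;
}

// Mix one frame: inputs[i] is the current sample of channel i.
inline float mixFrame(const std::vector<Channel>& channels,
                      const std::vector<float>& inputs) {
    float sum = 0.0f;
    const std::size_t n =
        channels.size() < inputs.size() ? channels.size() : inputs.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (!channels[i].muted) sum += channels[i].gain * inputs[i];
    }
    return sum;
}
```

A fader move just updates `gain` for its channel, and a mute button flips `muted`; everything else is addition.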

You Want More?

There’s one more application to mention here. Instead of using jack_mixer, we can use Ardour to do the heavy lifting. To set it up this way, there are two primary Ardour settings, found under preferences: Under the monitoring tab make sure “Record monitoring handled by” is set to Ardour, and the “auto Input does talkback” option is checked. Then add your tracks, set the track input to the appropriate input hardware, and the track output to the master bus. Make sure the master bus is routed to where you want it, and you should be able to live mix with Ardour, too.

This gives you all sorts of goodies to play with, in the form of plugins. Want a compressor or EQ on a sound source? No problem. Want to autotune a source? X42 has a plugin that does that. And of course, Ardour brings recording, looping, and all sorts of other options to the party.

Ardour supports our custom mixing interface, too. Also under preferences, look for the Control Surfaces tab, and make sure Generic MIDI is checked. Then highlight that and click the “Show Protocol Settings” button. Incoming MIDI should be set to our Arduino device. You can then use the Ctrl + Middle Click shortcut on the channel faders and mute buttons, to put them in learn mode. Wiggle a control to assign it to that task. Alternatively, you can add a .map file to Ardour’s midi_maps directory. Mine looks like this:

 
<?xml version="1.0" encoding="UTF-8"?>
<ArdourMIDIBindings version="1.1.0" name="Arduino Mapping">
  <Binding channel="1" ctl="16" uri="/route/mute B1"/>
  <Binding channel="1" ctl="70" uri="/route/gain B1"/>
  <Binding channel="1" ctl="17" uri="/route/mute B2"/>
  <Binding channel="1" ctl="71" uri="/route/gain B2"/>
  <Binding channel="1" ctl="18" uri="/route/mute B3"/>
  <Binding channel="1" ctl="72" uri="/route/gain B3"/>
  <Binding channel="1" ctl="19" uri="/route/mute B4"/>
  <Binding channel="1" ctl="73" uri="/route/gain B4"/>
  <Binding channel="1" ctl="80" uri="/route/mute B5"/>
  <Binding channel="1" ctl="74" uri="/route/gain B5"/>
  <Binding channel="1" ctl="81" uri="/route/mute B6"/>
  <Binding channel="1" ctl="75" uri="/route/gain B6"/>
</ArdourMIDIBindings>
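
The ctl numbers in that map are the standard MIDI CC numbers behind the names in the Arduino sketch: General Purpose Controllers 1 to 6 are CC 16, 17, 18, 19, 80 and 81, and Sound Controllers 1 to 6 are CC 70 to 75. If you'd rather generate the Binding lines than type them, something like this plain C++ would do it. The helper names are made up for illustration, and it only produces the inner lines, not the surrounding header and footer.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>

// CC numbers behind the names used in the sketch (standard MIDI assignments).
constexpr std::array<int, 6> kMuteCc = {16, 17, 18, 19, 80, 81};  // General Purpose Controller 1..6
constexpr std::array<int, 6> kGainCc = {70, 71, 72, 73, 74, 75};  // Sound Controller 1..6

// One <Binding> line in the same shape as the map above.
inline std::string bindingLine(int channel, int cc, const std::string& action, int strip) {
    std::ostringstream out;
    out << "  <Binding channel=\"" << channel << "\" ctl=\"" << cc
        << "\" uri=\"/route/" << action << " B" << strip << "\"/>";
    return out.str();
}

// All twelve bindings, mute then gain for each strip.
inline std::string allBindings() {
    std::string text;
    for (std::size_t i = 0; i < kMuteCc.size(); ++i) {
        const int strip = static_cast<int>(i) + 1;
        text += bindingLine(1, kMuteCc[i], "mute", strip) + "\n";
        text += bindingLine(1, kGainCc[i], "gain", strip) + "\n";
    }
    return text;
}
```

Adding a seventh strip is then a matter of extending the two arrays rather than hand-editing XML.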

The Caveats

Now before you get too excited, and go sink a bunch of money and/or time into a Linux audio setup, there are some things you should know. First is latency. It’s really challenging to get a Pipewire system set up to achieve really low latency, particularly when you’re using USB-based hardware. It’s possible, and work is ongoing on the topic. But so far the best I’ve managed to run stable is a 22 millisecond round-trip measurement — and that took a lot of fiddling with the Pipewire config files to avoid garbled audio. That’s just about usable for self monitoring and live music, and for playing anything pre-recorded, that’s perfectly fine.

The second thing to know is that this was awesome. It’s a bit concerning how much fun it is to combine some decent audio hardware with the amazing free tools that are available. Want to auto-tune your voice for your next Zoom meeting? Easy. Build a tiny MIDI keyboard into your desk? Just a microcontroller and some soldering away. The sky’s the limit. And the future is bright, too. Tools like Pipewire and Ardour are under very active development, and the realtime kernel patches are just about to make it over the finish line. Go nuts, create cool stuff, and then be sure to tell us about it!

35 thoughts on “How To Build Jenny’s Budget Mixing Desk”

  1. >That’s just about usable for self monitoring and live music, and for playing anything pre-recorded, that’s perfectly fine.

    Mind, there are other sources of latency. If you don’t want to be tethered to your desk and chair by headphone cables, the lowest latency wireless you can get is about 16 ms – so you really need your sound stack to be low latency. ASIO can get you down to 2-6 ms so you can get wireless audio and still keep below 20 ms.

    Old style analog FM headphones were virtually zero latency and didn’t suffer from stuttering buffers – and had superior battery life as well – but nobody makes them anymore. Another victim of pointless digitalization.

    1. Old style analog headphones would likely act as jammers for whichever frequency band they operate in nowadays. I remember a couple instances where that was causing problems. Mostly in the 868MHz and 2.4GHz bands. Old style analog radio technology was often wasteful and merciless to other users of these unlicensed multi purpose bands.

      1. Yep they did. And picked up other close frequencies as well as it was analog and not encrypted. There’s a scene in Spinal Tap that demonstrates this very well. I liked my old FM headphones for recording (I can’t use Bluetooth ones) but they weren’t ideal by a long shot.

        1. A slight hiss and the occasional interference was a non-issue for the main point. In fact, it was possible to hear two transmitters at once and I did use that “feature” once or twice. I had one transmitter for the PC and another for the TV, and if you tune it just right you can hear both.

          Another plus was the fact that multiple people could hear the same transmitter, which is not possible with Bluetooth and hasn’t been implemented in any of the other digital systems. You can’t get second headphones for your TV for example – but with FM you can do as many as you like. Have a silent disco.

          1. There are some multi headed BT audio options out there, but yes as far as I know it is not possible to set up as many receivers as practical in the space/radius of transmission the way it is with analogue RF. I think all of the BT implementations may be limited to exactly one extra headset.

          2. TVs don’t support multiple BT headphones out of the box. What you have to do is take the S/PDIF output (no analog outputs anymore), convert it to analog, then add multiple BT converters to however many headphones you need.

            The multiple conversions add latency, which ends up in the range of 100 ms and creates a dubbed movie effect where people’s lips move before you hear the sound.

          3. > I think all of the BT implementations may be limited to exactly one extra headset.

            They work by using an entire other BT interface – for each extra headphone you want to connect to. Trouble is, they share the bandwidth by sending the same data twice, for each device, so it’s not really feasible to add any more without running out. It’s a really stupid scheme. There is no real multicast for BT or any of the other digital transmission interfaces.

      2. Depending if you were using the broadcast band like the car stereo adapters do or rolling your own narrower fm stereo version, then you wouldn’t be using that much bandwidth, it’s just that other users assume they’re the only ones that matter even when they’re supposed to share. So if you need a contiguous 20MHz block for wifi and the analog fm users are scattered through the band, one might not be available. But for actual spectrum usage of audio things, Bluetooth channels are supposed to be 1MHz for regular or 2MHz for low energy, although they use adaptive hopping, so it’s spread across a wider spectrum over time and it avoids other users. They do have enough peak bitrate to move more than just a single stereo audio channel, of course. And there’s certainly other advantages too.

        1. The channel allocation for headphones in the US is 915.5 MHz, 916.0 MHz, 916.5 MHz. In the EU it’s supposed to be 863–865 MHz but otherwise similar. There are other bands available and legal to use license-free, but these are the most common for consumer devices.

          You basically have three channels and about 500 kHz bandwidth each. If what you’re saying is correct, then it’s ironic that the old analog system used less bandwidth for better effect.

      1. The problem is the latency varies a lot as packets may need to be re-transmitted on loss. If your average latency is 4 ms, your minimum may be 2 ms and maximum 20 ms, which means you have to add buffers, and that adds latency anyways.

  2. I had a go at something similar for mixing S/PDIF sources asynchronously, initially on a Raspberry Pi 4 and then on a Desktop PC. I got it working well enough for my needs and was envisioning repurposing cheap old firewire gear to add analogue inputs. This project is on hiatus, but I’ve shared my experience so far here: https://www.vogons.org/viewtopic.php?f=62&t=74902

    This includes audio quality loopback tests for distortion, frequency response and latency, as well as various hardware trials and errors.

    1. When I started on that article, I thought that it would be an analog mixer, like the ones I built almost 40 years ago. Ah well, naturally not, that would have been so 1980s.

      Of course, now that everything has been digital for the last two decades, even motorized Penny & Giles faders seem, like, old-fashioned.

      They are probably still the hottest shit on earth.

  3. Oh, and assuming my methodology for calculating latency makes sense, I managed to get it down to less than 12 milliseconds on a Pi 4 (with USB 1.1 audio inputs and a USB 2.0 audio output):
    https://www.vogons.org/viewtopic.php?p=951015#p951015

    And I got below 10 milliseconds on an x86-64 host using PCI audio cards as inputs and a USB 2.0 audio output without trying to push things to the limit:
    https://www.vogons.org/viewtopic.php?p=958750#p958750

    More tweaking might allow better results.

  4. VoiceMeeter. It comes in three flavors of the number of I/O channels and is “donationware”. There’s a minor learning curve, but nothing that a person familiar with audio equipment and basic computer I/O can’t handle.

  5. “Behringer U-PHORIA” => Behringer UMC404HD

    Solved a problem I’ve had for a while. I’ve been looking for a sound card with at least four ADC channels for a while. I’d found some, but they cost way too much. The Behringer thing costs more than I’d like, but it is only “grit your teeth and pay” rather than “nope, no way.”

    I’ve got one on order. We’ll see what comes of it.

    Sonar imaging, here we come!
