Ask Hackaday: The Ten Dollar Digital Mixing Desk?

There comes a point in every engineer’s life at which they need a mixing desk, and for me that point is now. But the marketplace for a cheap small mixer just ain’t what it used to be. Where once there were bedroom musicians with a four-track cassette recorder if they were lucky, now everything’s on the computer. Lay down as many tracks as you like, then edit and post-process them digitally without much need for a physical mixer. Isn’t it great to be living in the future!

This means that those bedroom musicians no longer need cheap mixers, so the models I was looking for have disappeared. In their place are models aimed at podcasters and DJs. If I want a bunch of silly digital effects or a two-channel desk with a crossfader I can fill my boots, but for a conventional mixer I have to look somewhat upmarket. Around the three figure mark are several models, but I am both a cheapskate and an engineer. Surely I can come up with an alternative.

Cheap And Nasty Sound Cards To The Rescue!

An analogue mixer is an extremely simple device at heart: it sums a series of audio signals, each of which has its own volume control fader. It’s so simple that one can be made with passive components only, and indeed there are extremely affordable mixers that do just that.
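The same summing idea can be sketched in software. This is only an illustration, assuming each channel is a list of float samples in the nominal range -1.0 to 1.0 and each fader is a plain gain multiplier; the function name `mix` and the hard clipping are choices made for this sketch, not anything from a particular mixer.

```python
def mix(channels, gains):
    """Sum equal-length channels sample by sample, each scaled by its fader gain."""
    if len(channels) != len(gains):
        raise ValueError("need exactly one gain per channel")
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same number of samples")
    mixed = []
    for i in range(length):
        total = sum(g * ch[i] for g, ch in zip(gains, channels))
        # Hard-clip to [-1.0, 1.0] as a simple guard against overload
        # (an assumption of this sketch; a real desk would just distort).
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed


if __name__ == "__main__":
    voice = [0.5, -0.5]
    guitar = [0.25, 0.25]
    print(mix([voice, guitar], gains=[1.0, 2.0]))
```

Everything a small mixer does beyond this (EQ, pan, aux sends) is extra processing on each channel before the sum.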

They claim this thing has a TI PCM2902 chip inside, and who am I to dispute that!

Most small mixers however use straightforward op-amp gain stages and buffers, with adjustable ones for each channel. It’s possible to make one without too much bother, and indeed I considered exactly that. The problem was that the budget climbs with each successive channel towards the point at which I’d be better off spending a bit more and buying one. I’m not pricing for the most expensive faders on the market, but a reasonable quality linear potentiometer adds quite a bit per channel to the BoM.

At this point it occurred to me, can I use the PC as a live mixer with multiple sound cards? I can order a heap of very cheap and nasty USB sound cards for under ten dollars, so it won’t cost me much to try. I placed the order, and when they arrived I plugged them in and instantly had a computer with five audio jacks.

Unfortunately I can’t just fire up Audacity expecting an awesome multi-channel experience. I have a load of sound cards to choose from, but I can only record from one of them at any one time. It’s time for a dive into Linux audio, to a level I’ve never needed to do before because, well, it’s always just worked, hasn’t it?

Who Knew There Was So Much To Linux Audio!

A screenshot of Alsa Mixer, showing a list of sound cards
So near and yet so far, I can see them but not touch them!

In the beginning, there was the Open Sound System, or OSS. My Linux in the 1990s was all about setting up web servers, so the first Linux sound subsystem passed me by. Instead, like probably most of you, I’m used to ALSA, the Advanced Linux Sound Architecture. This sits at kernel level and provides an interface to the disparate pieces of sound hardware that may be connected to the system. On top of that lie sound servers providing a further interface layer such as PulseAudio or Jack, and in many distributions those sound servers have in turn been replaced by PipeWire.

All these promise mixing and multiple card support as their killer feature, so somewhere in that lot it should be possible to find what I want, right? Unfortunately not, because while they can all see a load of soundcards, none of the various machine configurations I tried could make applications see more than one of them at once. Perhaps a solution could be found in binding several cards together as a virtual ALSA card. But here yet again there’s no reward, because as the instructions point out, the real hardware will drift out of sync over time. I wonder whether my live mixer application would find this less problematic than a simultaneous multi-track recorder, but something tells me if it did, everybody would be doing it.
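To put a rough number on that drift, here is a back-of-envelope sketch. The 100 ppm mismatch between two cards is purely an assumed figure for illustration (I haven’t measured these cards), and both function names are made up for the example.

```python
def drift_samples(sample_rate_hz, mismatch_ppm, seconds):
    """Samples by which two nominally identical clocks diverge after `seconds`."""
    return sample_rate_hz * (mismatch_ppm / 1_000_000) * seconds


def seconds_until_offset(sample_rate_hz, mismatch_ppm, offset_ms):
    """Time until the two streams have slipped `offset_ms` milliseconds apart."""
    offset_samples = sample_rate_hz * offset_ms / 1000
    return offset_samples / drift_samples(sample_rate_hz, mismatch_ppm, 1)


if __name__ == "__main__":
    rate = 48_000   # Hz, nominal rate of both cards
    ppm = 100       # assumed clock mismatch, illustration only
    print("drift per second:", drift_samples(rate, ppm, 1), "samples")
    print("seconds until 10 ms apart:", seconds_until_offset(rate, ppm, 10))
```

With those assumed figures the streams slip by roughly five samples every second, which is why unsynchronised cards need either a shared clock or continuous resampling.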

So I’ve conspicuously failed to make a cheap live mixing desk out of a thousand-dollar laptop and ten dollars’ worth of cheap sound cards. Plenty of you will no doubt be queueing up to berate me for my less-than-1337 level of Linux wizardry, but the truth is I’ve never really concerned myself with the multimedia features before. I’m still curious though, can this be done? Answer me below in the comments!

110 thoughts on “Ask Hackaday: The Ten Dollar Digital Mixing Desk?”

  1. >it’s always just worked, hasn’t it?

    No. You’ve just never really done anything with it. Basic stereo with no EQ is “fine”. Anything beyond that goes hairy really quickly.

    1. Which reminds me; has anyone else noticed the fact that audio equipment or the audio mixing these days tends to emphasize a sort of upper bass to mid-range between 300-500 Hz to the point that it sounds like people are speaking through a plastic cone?

      I’ve taken hearing tests and there’s nothing too much off with my ears, it’s just that all reasonably priced audio equipment and music sounds unnatural and terrible until I put a slight notch filter on it somewhere in the neighborhood of 500 Hz. That’s why I refuse to live without a global equalizer for my sound card.

      My theory is that people are mixing for 2.1 and 5.1 setups with only a sub and some tiny tweeters, so there’s no mid-range and they’re cranking it up to compensate. Only problem is, it doesn’t make it sound any better.
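For anyone wanting to try that sort of gentle cut around 500 Hz, one way to sketch it is a biquad peaking EQ with negative gain, using the widely circulated formulas from Robert Bristow-Johnson’s Audio EQ Cookbook. The -3 dB depth, Q of 1.0 and 48 kHz sample rate below are assumptions picked for illustration, not settings taken from the comment above.

```python
import cmath
import math


def peaking_eq(f0_hz, gain_db, q, sample_rate_hz):
    """Biquad peaking-EQ coefficients (b, a), normalised so that a[0] == 1."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha / amp
    b = [(1 + alpha * amp) / a0, -2 * math.cos(w0) / a0, (1 - alpha * amp) / a0]
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha / amp) / a0]
    return b, a


def gain_db_at(b, a, freq_hz, sample_rate_hz):
    """Magnitude response of the biquad at one frequency, in dB."""
    z = cmath.exp(-1j * 2 * math.pi * freq_hz / sample_rate_hz)
    num = b[0] + b[1] * z + b[2] * z * z
    den = a[0] + a[1] * z + a[2] * z * z
    return 20 * math.log10(abs(num / den))


if __name__ == "__main__":
    # Assumed settings: a 3 dB cut centred on 500 Hz with Q = 1.0 at 48 kHz.
    b, a = peaking_eq(500, -3.0, 1.0, 48_000)
    for f in (50, 250, 500, 1000, 5000):
        print(f, "Hz:", round(gain_db_at(b, a, f, 48_000), 2), "dB")
```

The coefficients can be dropped into any biquad filter stage; a system-wide equalizer does the same thing with a friendlier interface.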

      1. All the recent audio gear I’ve had has been fine, perhaps not quite as even and crisp in reproduction across all of human hearing as the older Surround Sound amp I daily drive. But then that’s an unfair comparison anyway – comparing portable devices and cheaper sound stuff with something that is much larger and more high quality even if it is older – the new stuff should lose. I’ve no doubt you could have crappy gear though, some must exist.

        The mixing certainly can be problematic though – if your source only has stereo you should be golden really, as that will upmix to however many speakers you have (and downmix to mono) and sound just fine (assuming your soundcard/amp/speakers are actually set up right). But I have had stuff where the file only has one variety of Dolby Digital encoded into it for a particular speaker layout and version of Pro-Logic/DTS whatever, and that tends to be more problematic if you don’t have a matching sink. It can create the effects you describe – the voice primarily mixed into a channel that doesn’t exist and getting mixed into your other speakers not at all or insufficiently well, probably only pushed to the tweeter and not the sub at all.

        That issue is starting to get me considering a real upgrade to the primary sound system: while my old amp sounds great in surround, it only supports the last generations of lossy Dolby surround and not the lossless ones, so it isn’t as easily compatible as it used to be, both in I/O and in the formats it can play.

        1. What I mean is, people no longer own “stereos” with full range speakers. Instead they have a big sub and a bunch of smaller satellites, or “sound bars” and other DSP-enhanced junk that do the missing fundamental trick to fake having lower frequencies. It’s got nothing to do with dolby surround or any of that. Just plain stereo music sounds bad.

          I think people no longer have mid range speakers, so the producers are trying to compensate.

          1. And I’m saying that isn’t my experience, as everything I’ve played with has a pretty good range of low all the way through to high frequency response. You don’t need the giant bookcase speaker to get a good sound anymore. Heck, even tiny little things like the Steamdeck and some of the thin and light laptop crowd can produce a pretty full sound despite being tiny speakers thrown into a tiny space as almost an afterthought. Obviously it’s not as good as a real sound system, won’t go as loud without distortion etc., but some of them at least are really quite respectable.

            And something like a soundbar should have no trouble, those things are pretty darn big really. If you are finding a system can’t reproduce stereo properly then it almost certainly has a flaw in the configuration somewhere creating the problem rather than the system won’t play it just fine once set up, at least from my experience with modern cheaper stuff where it does sound just fine.

          2. People own headphones/earphones and use them with smartphones and laptops. I tend to think that’s a bigger part of what things are mixed for than actual whole-room speakers. Though I don’t know that from experience.

          3. >can produce a pretty full sound despite being tiny speakers thrown into a tiny space as almost an afterthought

            That’s a psychoacoustic effect. Same as with the sound bars – they’re small speakers stuffed in a big box to make it seem more impressive and the sound is done by DSP.

            https://en.wikipedia.org/wiki/Missing_fundamental

            >”One example of a popular song that was recorded with MaxxBass processing is “Lady Marmalade”, the 2001 Grammy award-winning version sung by Christina Aguilera, Lil’ Kim, Mýa, and Pink, produced by Missy Elliott.”

            They bake that processing in at the recording stage as well.

          4. >a soundbar should have no trouble, those things are pretty darn big really

            The typical soundbar has tiny 1-2″ elements stuck in a long resonant tube to do both the duty of a “subwoofer” and a tweeter. They rely on the resonant cavity to produce any bass at all.

            A typical mid-range speaker element is something around 5-8″ and these used to be the main speakers in a stereo setup without a subwoofer, paired with a tweeter. They reproduce sounds in the 250 Hz to 2000 Hz range. Now with the mid-ranges gone and replaced by just tweeters and subs, or tweeters pretending to be subs with a long resonant tube, there’s a hole right there in the 250 Hz to 1 kHz range where there’s not a whole lot of sound power coming out of the speaker systems.

          5. You won’t get the full on lowest frequency bass without a sub on a sound bar, but they are very capable on the mid range, nice big resonance chambers and nobody serious suggests you should do without a sub do they?

            As for processing being done at the recording stage, never seen any evidence of it. But then I don’t listen to Pop, and nothing I’ve listened to has been so ruined.

            And if it is done well it doesn’t matter if it’s generated by processing or not, it sounds right – and as folks’ sensitivity to frequency changes with volume, there is generally some processing on all types of sound systems to match that change anyway.

          6. Oh also the performance of smaller speaker drivers is so much better than it used to be as well – those 1-2″ may well be able to function better than your old 4-8″ across the frequency range, maybe even comparable to even bigger cones. For example when I first got this surround sound setup I put the ‘tweeter’ on my dad’s amp to get a feel for it, and compared to the giant torso sized speakers he had then they did lack the lowest frequency, but really not badly. Not long later he dumped those giants for speakers more shoebox size and finds they sound better across the board with the same everything else…

          7. >capable on the mid range, nice big resonance chambers

            Resonance chambers are no good for transients, because by definition they operate by building up energy over multiple wave cycles. They distort and muddle the sound, and they “color” it by their internal reflections etc. which gives the speaker a characteristic tone (plastic, tinny, etc.). I would not call them capable at all – merely able to fake a mid-range response.

          8. Heck if you are that worried about any colour to the sound you better have an anechoic chamber to listen in as well – otherwise the room will be changing the sound a fair bit too! There is always some “colour” to a sound in any listening setup…

            The question is not if it exists, only if it’s really really bad or really unsuitable to reproduce the intended sounds being played. And nothing I’ve happened to use has been bad, most of it’s really quite good actually…

          9. Seems a longer comment got lost somewhere, but in short Dude everything colours the sound reproduction – all speakers have resonances AND unless you’re listening in an anechoic chamber the room itself makes up a major factor in how audio sounds… There is no getting away from that!

          10. I hate to be the devil’s advocate, but it’s The Dude who is actually right here. If someone is saying that the size/radius of the cone doesn’t matter or that old speakers have bad sound, they understand neither acoustics nor how waves are emitted and propagated, and also they probably just had an elephant stomp on their ear as a kid.

            Foldi-one, read some articles or books about it before going ad nauseam with this BS.

          11. @Sqvaard There is an order of magnitude or more difference between saying the new stuff sounds good enough across the spectrum, or that the smaller cones of today generally have better response than their historic counterparts (which is simple to prove just look at the response curves to similar size drivers from ye olden days and today) and saying size doesn’t make a difference at all!!!!!

            At no point have I even tried to claim bigger speakers are bad in some way! Just that modern smaller speakers are able to be good enough to create a good across the board frequency response! Because they generally at least are!

          12. > no speaker is sufficient for you, as all of them have resonances

            There are no perfect speakers – but that doesn’t mean some of them can’t be worse than others. Using resonance to “fix” the frequency response of tiny speaker elements has some serious tradeoffs.

          13. >which is simple to prove just look at the response curves

            That doesn’t say anything about phase delay and transient response. Since you need to build up energy over several wave cycles to excite your resonance, you’re obviously going to mess with both.

            Phase information is especially important for speech: you can scramble and distort the frequency spectrum all you want, add or remove harmonics, but if you mess up with the phase information it becomes unintelligible. Letters like P, T, V start to sound the same etc.

          14. That last bit I’m talking about the bare driver – no resonant structures to speak of.
            Advances in magnets and materials for the cone mean the spikey nature of the older cones that really only work in a very narrow range with steep falloff outside that is rather flattened. So your tiny cone of today can actually produce frequencies one of many decades past couldn’t even come close to, and has a much more even response across that range BEFORE you put it in a box and deal with resonances!

      2. There was a time when music produced in the USA had predominant highs and lows because the speakers had a pronounced mid, correcting the whole sound spectrum. Something with Bose or KEF and Toto, I recall.

      3. Social media is what drives music marketing in 2023. Smartphones are the primary way users consume social media. Smartphones aren’t good at reproducing low bass because of physics.

      4. I tend to find that since everything is mixed for high volume listening, but I listen at much lower volumes, very low and high frequencies are barely there unless you use something with a v-shaped sound, which will usually get poor reviews from people who listen at higher volumes and find it unpleasantly thumpy or screechy. Plus, common cheap bass-heavy headphones/earphones/speakers do a bad job of conveying sub-bass, so instead they add a ton of muddy mid-upper bass to make up for it. And hearing loss can throw a wrench in it if you have any of that. Loudness compensation used to be nice, but it’s not generally available despite the fact we could probably recalculate it on the fly as the listener changes volume now.
        (See https://en.wikipedia.org/wiki/Equal-loudness_contour)

        1. I have some fairly nice headphones as well that go all the way down to 10 Hz so they just shake your head around. Still the same problem – maybe it’s the loudness compensation since I don’t like to blast my ears either.

    2. I have an old Traynor 604. Paid $35 for it. Six inputs, and you can fake a stereo output with the monitor mix. Best part is it’s got 100 watts (mono) of Traynor goodness if I want it. (Oh, and spring reverb). Put that into a USB interface and you’re 1970s golden. Keepin’ it analogue.

  2. I’ve never tried to do this with Linux, but I did do this a solid 20 years ago with Windows, multiple PCI sound cards, and Cool Edit Pro. Used four sound cards to do 8 channel live reinforcement and/or home studio recording. Worked quite well back then.

    1. This was one of the first things I tried when we went from ISA to PCI and I no longer had to worry about DMA/DIRQ/IRQ/ACK and DACK settings. And yes, Cool Edit/Winamp were the apps I used mostly, but there was also an app that came with ESS sound cards that could record and pipe channels as if they were tracks. It ran in DOS via Windows and to adjust you used the number keys, e.g. press 1 to increase channel one, shift 1 to decrease, and the final mix could be sent to any output at the press of a letter. While each channel was treated separately, there was a lot of audio bleed between channels even with good cables/mics etc., so we often just put one mic per stereo pair. Ah, them were the ad-hoc days.

    2. In Linux (and Windows), I have used Mixxx (mixxx.org) with 4 Griffin iMic USB audio interfaces. You can configure them as AUX, MIC or even turntable inputs. For stereo, go with AUX or turntable. The advantage of using AUX is less screen space. The advantage of turntable is that each input gets a dedicated 3-band EQ while leaving the assignable EQs available for other uses. Besides EQ, Mixxx has several other assignable effects.

  3. The problem with a live mixer application will be latency. Even if you could somehow combine a load of soundcards into one virtual soundcard, you’d have two significant sources of latency working against you: the latency of USB (which will be exacerbated by hanging all of the soundcards off one USB lane – only one of them will get to talk at a time), and the latency of the resampling it’d require to mix all of the soundcards’ inputs down into a single card’s output. It’s going to be entirely too easy to end up with a noticeable (or worse, delay-scale) lag from input to output.

    And then there will be all the ground loops… USB has a reputation for being *horrible* for clean sound. The folk who make $1000+ USB audio interfaces with kerjillions of audio inputs care about that kind of thing, and go out of their way to ensure that the audio bits are isolated, in power and noise, from the USB part. The makers of $2 USB soundcard keys can’t afford to, and don’t, care.

    As to how one might go about it despite that, JACK supports this kind of thing directly, and so I suspect Pipewire would too – but one card (probably the one you want to hear the output from) will always have to be designated the master, as it’s that one that sets the sample clock; all of the other cards would be resampled and fed into JACK via alsa_in clients, one instance per card. JACK is likely the best hope of minimising latency, too – that being one of its major design goals. But even so, I’d be astonished to find that latency could be reduced to anything approaching usable.
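To show what “resampled to the master clock” means in practice, here is a deliberately crude sketch using linear interpolation. Real tools use far better interpolation and keep adjusting the ratio as the clocks wander; the function name `resample_linear` and the example rates are assumptions made for illustration, not anything JACK does internally.

```python
def resample_linear(samples, src_rate_hz, dst_rate_hz):
    """Resample a list of samples by linear interpolation between neighbours."""
    if len(samples) < 2:
        return list(samples)
    step = src_rate_hz / dst_rate_hz  # source samples advanced per output sample
    out = []
    n = 0
    while True:
        pos = n * step
        i = int(pos)
        # Stop once there is no right-hand neighbour left to interpolate towards.
        if i >= len(samples) - 1:
            break
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        n += 1
    return out


if __name__ == "__main__":
    # A slave card that actually runs at 48005 Hz, pulled onto a 48000 Hz master.
    slave_block = [float(i) for i in range(10)]
    print(resample_linear(slave_block, 48_005, 48_000))
```

The point is only that every non-master card needs a stage like this, and each such stage needs some buffered input to work on, which is where the extra latency comes from.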

    1. You can set the USB polling rate to 1000 Hz (USB 1.1) up to 8000 Hz (USB 2.0) with the appropriate drivers, which basically eliminates the latency problem. The main issue is Linux + appropriate drivers.

        1. The polling interval for actual device access is some multiple of the minimum transaction interval and it is decided by the operating system and/or your device drivers. HID devices like keyboards and mice are typically polled at 125 Hz (8ms) by default.

          1. Sound cards might use the isochronous transport mode of USB, which basically guarantees polling at a requested frequency. This is done by placing entries at the desired frequency in the isochronous reservation table, which the host chip processes first before serving other types of requests. Iso devices are placed in the table first-come, first served, so once the table fills up, additional devices either can’t be accommodated, or else they only work at lower service frequencies.

            Note that iso mode is only entered once you start streaming. So you might plug in lots of devices just fine, only to find that you can stream from a limited number of them at once. Video cameras are where you usually run into this limitation. I’d imagine that for audio-only devices, the limit is pretty high.
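As a rough feel for how many audio streams fit, here is a back-of-envelope sketch. It assumes a full-speed bus with 1 ms frames and a budget of about 1350 bytes per frame for periodic transfers (90% of the 1500 bytes that 12 Mbit/s carries in 1 ms), and it ignores per-packet protocol overhead unless you pass some in. Both function names and the 16-bit 48 kHz stereo format are assumptions for illustration.

```python
def bytes_per_frame(sample_rate_hz, channels, bytes_per_sample, frame_hz=1000):
    """Average audio payload carried in each USB frame (1 ms at full speed)."""
    return sample_rate_hz * channels * bytes_per_sample / frame_hz


def max_streams(budget_bytes_per_frame, payload_bytes, overhead_bytes=0):
    """How many identical streams fit in the per-frame budget."""
    return int(budget_bytes_per_frame // (payload_bytes + overhead_bytes))


if __name__ == "__main__":
    payload = bytes_per_frame(48_000, channels=2, bytes_per_sample=2)
    budget = 1350  # assumed periodic-transfer budget per full-speed frame
    print("payload per frame:", payload, "bytes")
    print("streams that fit:", max_streams(budget, payload))
```

Mono capture or a lower sample rate stretches the budget further, and cards behind a high-speed hub or on separate host controllers change the sums again.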

          2. I ran into the polling rate problem because I got an old MIDI keyboard, for which the Windows 7 drivers work just fine, but the Windows 10 drivers don’t because they drop the polling interval down to 8 ms and that causes glitches. The manufacturer took the opportunity to “obsolete” their hardware by not providing proper driver support.

      1. imo one of the biggest stories of the last 20 years of computing has been this trend of throwing more and faster processors and I/O at the problem, in the hope of eliminating latency. and you can see some great success, for example, in terms of the average latency from one frame to the next in a videogame.

        but the end lesson has been really clear for a while now: latency is a huge problem that is almost unmanageable. it’s true that things are so fast that in many cases — even highly-demanding ones — the latency can simply be ignored. and, in fact, that’s how i usually like to think about it. but it has become almost impossible to put upper limits on the latency, or to guarantee synchronicity or repeatability. there are just so many different interactions between totally asynchronous/buffered processes at different levels. it has become really hard to reason about.

        so i think USB is fast enough that a lot of users will simply assume it is synchronous and never ask the question and always be satisfied. but there is no “fast enough” where USB audio timing issues disappear. anyone who has specific quantified standards will find that USB is basically unable to meet them, or can only meet them in narrow circumstances with a lot of effort.

        in my experience, working with digital audio stacks, i can easily tune things to where i might be able to say “that delay does not disorient me while i’m playing.” but “that delay is effectively 0” is absolutely unattainable and even “that delay is small and consistent throughout my session” is a stretch.

        1. Indeed, many ‘hard real time’ tasks that used to be really really hard to get right are often now just throw the hz at it, as even cheap embedded compute can be really quite speedy.

    2. USB audio devices use isochronous transfers, so the number of devices shouldn’t affect latency. If the host can’t guarantee the data requirements of the device, it’ll simply refuse to configure it.

      Clock syncing and audio quality will be problems with cheap cards. Both require some degree of expertise to do right, and I wouldn’t expect either from any Aliexpress cheapies.

  4. Check craigslist. I see lots of mixer boards going for cheap here in NH, probably much more common in more populated areas. There’s 2 in my local area right now.

    Probably easier and more intuitive than working with linux.

    People want to become musicians as kids, then decide on a different path and want to sell their used mixers. We even got a half dozen of them donated to the makerspace.

    1. ^ this, I cannot seriously believe the writer’s claim that there’s no passable mixers out there for less money than trying to build your own.

      I bought a big 8-channel powered one with speakers and everything for about $150 fairly recently, and it was by no means the only option on eBay at the time.

          1. In the UK tech tends to be relatively expensive compared to our earnings when new, so even the second hand market isn’t really cheap. Used to be even when the £1 was worth around $2 that the price paid for something in the UK was pretty much just replace the $ with a £ – so effectively twice as expensive, and that is without considering the relative earnings to cost of living…

  5. sort of a solution,
    Add an array of Teensies, digitize the audio data, collect all the few kHz bandwidth data (over HID?) and maybe use the PC or a Raspberry Pi or similar to do a data mixer…
    SoCs have become fairly powerful in terms of cost, processing power and power consumption.

  6. One problem with this solution of multiple sound cards is that the sample clocks of the cards will be slightly different and out of phase with each other. Trying to mix the result will not turn out well. At the very least, they should be modified so they can all run from a single master clock.

    1. This was my first reaction, as well, but I bet these beasts derive their clock from the USB clock, or rather, they have a PLL that makes sure the internal USB-side oscillator doesn’t run out of sync with the USB host; so why not derive the sample clock from that.

      1. No, USB doesn’t deliver a clock signal itself. You need a good quality 12MHz clock on the device side already to talk successfully to a host, so these devices always have a crystal on them.

        1. There’s plenty of USB devices that don’t require crystals, they use the (nominally) 1ms USB start-of-frame to tune their internal oscillator. From memory, the USB audio class spec also includes modes where the audio codec synchronizes to the host clock using USB frames.

          Texas Instruments at least used to make a range of microcontrollers specifically for streaming audio over USB, which had dedicated codec PLLs you could sync to the device’s own crystal, the USB host, or an external audio device.

  7. As it happens, I work with TI PCM290x hardware for my job, and I’ve had some success recording multiple channels simultaneously in Windows using N-Track Studio. That software is not free but there is a free trial and the pricing is very reasonable. Unfortunately there’s no Linux version as far as I know, but there are versions for Windows, Mac, Android and iOS.

    (I don’t have any business affiliation with N-Track, I’m just a satisfied very occasional user)

    === Jac

    1. My first thought was to use VB audio Voicemeeter (not a spelling mistake) – that bit of software allows you to combine several different sources into one stream that can then be recorded in Audacity or other. Soundflower for Mac is similar.

  8. A problem you would run into is keeping a bunch of these in sync, or even getting them synced up to begin with. Even with multiple channels on a single sound card you can get sample drift from one channel to the next. I made a 64-track recorder about a decade ago, and I was seeing the channels out of sync by a few samples and seemingly random offsets even when they were all fed from the same digital source.
    On Windows and macos there are a few solutions for combining multiple audio devices into one (macos has a built-in tool for this in the audio midi setup application), and on linux you have things like pulse audio and jack.

  9. If you want to do things the cheap and dirty way – without ordering the right equipment to start with – what we want is a weird, junky alternative to buying linear pots. If changing resistance with a slider is out, what’s next? Well, a transformer with a variable ratio will vary the output. It needs to be one that has a high coupling coefficient and a low inductance so it won’t have a high dependence on frequency as long as the core material itself can keep up with it. (so maybe avoid laminated steel 60hz power transformers, and the load probably needs to be dominantly resistive). And to do it with junk instead of buying things means finding or making the core, which is probably best chosen as a solenoid (cylinder) as it’s easier to make things move in straight lines that way. You might be able to reel and unreel from a spool with a completely closed magnetic loop, but inserting a rod into a couple of unmoving coils of different lengths would be a lot easier even though the magnetic field goes everywhere. No idea if you could ever get the coupling to work well enough, but 2x the turns would be 2x the voltage in the ideal case, so you could probably make up for any nonlinearity across length by adjusting the spacing of the turns along the length. In order to get the rods, though, you’d probably need to grind up the ferrite beads off of a bajillion old cables and pack the powder into some straws or something. But hey, at least you didn’t need to buy anything made for the purpose!

    If anyone ever actually does the above, I want to see it XD

    1. >If changing resistance with a slider is out, what’s next?

      Converting a regular rotary pot to a linear pot by running loops of fishing line around the shaft. Tug on the fishing line, the pot turns.

  10. I have a mixing board, which isn’t that hard to come by (mine is made by ammoon and a search for “ammoon mixer” on amazon turns up tons of small mixing boards available for quite reasonable prices, at least in the US). As for on-PC mixing, I’ve dabbled with PulseAudio and Jack and had a bit of luck, but it was a huge pain. What I primarily use is “VoiceMeeter” in Windows. It is not free or open source (though they do have a reduced-capability version for free and it is extremely cheap). It works very well for me, and allows me to mix lots of audio streams on the fly, though there is a bit of a learning curve.

  11. You could use Pipewire to make this work with your sketchy hardware. Been there, done that. There are even apps designed for this purpose, like jack_mixer or even Ardour. Remember, Pipewire implemented a Jack server.

    For hardware, I have a Behringer U-phoria UMC1820, which has 8 pre-amps built-in. There are cheaper/more expensive options, depending on what you need.

  12. I remember getting a “pirated” copy of Gold Wave off mIRC in 2002-ish? I poked around with it for a year trying to make my own mixes without any hardware or quality sound files to start with. We were still using P2P file sharing clients at the time since BitTorrent wasn’t as main stream yet. Getting studio quality multi-channel tracks was like pulling teeth on IRC. You had to know someone who knew someone who could invite you. Granted that was when the RIAA was suing preteens for file sharing so I can understand the paranoia.
    Anyway, after a year of making horrible music and realizing just how much calculus (never mind quality equipment) was involved in good music I gave up and left it to actual artists and audio engineers. I take my hat off to anyone who can make and mix good music, regardless of the hardware they employ.

  13. I’ve tried doing this on Linux, and it’s janky at best. On Windows it feels like a hack and has way too much latency. On Mac, it’s a built-in function that just works, with surprisingly good latency, automatic resampling, and amazing reliability. I’ve considered for a while writing a Linux multi-device combo driver, but I’m not sure I could do better than Jack, although making it work at the ALSA/whatever level might be more reliable.

  14. If what you want to do is just mix and record multiple tracks, I just did this on Windows. You don’t actually need physical audio hardware that supports more than 2 tracks. I used Mixxx to mix two songs together, but I wanted to output all 4 channels instead of just the two mixed ones. Mixxx allows you to output the 2 mixed channels to the speakers, but you can also direct the output of each “deck” to another audio sink. The key is using WASAPI and Virtual Audio Cable. I created a 4 channel device with VAC, and then I could assign channels 1-2 to the output of deck 1, and channels 3-4 to deck 2. (I tried 6 channels with a 3rd deck, but the audio started to bleed through.) Then there’s a control panel setting you have to change to support more than 2 channels, but after you do that you can select the 4-channel device and record them directly in Audacity.

  15. Could the sync issue be solved by removing the clock crystals from all the soundcards except for one, and driving them all from that master, perhaps with a buffer/amplifier? Same trick people do to get coherent sampling from a bunch of RTL-SDRs…

  16. I have used those sound cards as poor man’s DAQs by setting custom udev rules (DEVPATH==”…”, ATTR{id}=”….”), and then using GNU Radio to uniquely identify and access each card (audio.source(samp_rate, “plughw:attr_id”, True)).

    It did have some sync issues, though. GNU Radio also may introduce noticeable latency to the stream.

  17. I’ve been looking for a good idea to do a personal FPGA project. This might be just the thing. I’ve also been wanting to do some more with the iCE40 and open source toolchain, and someone has already made an iCE40 USB soundcard design. I think I’ve got a couple of Pmod ADCs lying around as well.

    1. This would be useful for connecting pairs of mics to a pair of mic inputs. Or does it do 8 to mono? It’s not USB capture, which is the intent here; mixing comes later in the digital domain. This box would be good for 4 keyboards in stereo going to a single mixer input, where the musician does the keyboards’ pre-mix.

    1. Yep, this or qjackctl is half the solution. But for actual mixing, you need something in the middle to control volumes. Ardour can do it, and there’s jack_mixer for a lighter solution. For added fun, get a midi control interface, and link the physical faders to your software’s channel controls.
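For anyone wondering what the "something in the middle" has to do, here is a minimal sketch in plain Python. It is not how Ardour or jack_mixer are implemented; the function names `mix` and `db_to_gain` are made up for illustration, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
def db_to_gain(db):
    """Convert a fader position in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def mix(channels, gains, limit=1.0):
    """Sum equal-length channels sample by sample, scaling each by its gain.

    channels: list of lists of float samples (assumed range -1.0..1.0)
    gains:    one linear gain per channel, e.g. from db_to_gain()
    """
    if len(channels) != len(gains):
        raise ValueError("need exactly one gain per channel")
    if not channels:
        return []
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same length")

    mixed = []
    for i in range(length):
        total = sum(g * ch[i] for ch, g in zip(channels, gains))
        # Hard-clip so the summed signal stays inside the valid output range.
        mixed.append(max(-limit, min(limit, total)))
    return mixed


if __name__ == "__main__":
    vocals = [0.5, -0.5, 0.25]
    guitar = [0.25, 0.25, -0.25]
    # Vocals at unity gain, guitar pulled down by 6 dB.
    print(mix([vocals, guitar], [db_to_gain(0), db_to_gain(-6)]))
```

A MIDI fader box would simply change the values in `gains` while the audio runs; a real mixer would also want something gentler than hard clipping.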

  18. Depends on your use case really. If you need live monitoring of the mixed output… just give up. The round trip of just the USB protocol introduces nearly 20ms which is about my limit for being able to play along comfortably. That’s with 0 buffer. As soon as you buffer the output, latency blasts through the roof. And you’re gonna hafta buffer because you’re resampling all your inputs because they aren’t clocksynced.

    However, if you’re only recording (without live monitoring) and then playing back the result while adjusting the levels – in other words: ‘mixing’ – latency and clock sync don’t matter at all (other than a very slight delay when you click play).

    Prolly the easy solution here is an eBay PreSonus or Behringer USB audio interface. They skip the linear pots on their BOM and are very reasonably priced.
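To put rough numbers on the buffering and clock-sync points above, here is a small back-of-envelope sketch. The 100 ppm clock mismatch is a hypothetical figure picked for illustration, not a measurement of these cards, and the function names are invented here.

```python
def buffer_latency_ms(frames, sample_rate):
    """Delay in milliseconds added by one buffer of `frames` samples."""
    return 1000.0 * frames / sample_rate


def drift_samples(sample_rate, ppm_offset, seconds):
    """Samples by which two free-running clocks diverge after `seconds`.

    ppm_offset is the frequency mismatch between the two cards in parts
    per million (an assumed value, not a measured one).
    """
    return sample_rate * (ppm_offset / 1e6) * seconds


def seconds_to_drift(frames, sample_rate, ppm_offset):
    """Time until the drift alone amounts to `frames` samples."""
    if ppm_offset == 0:
        return float("inf")
    return frames / (sample_rate * ppm_offset / 1e6)


if __name__ == "__main__":
    for frames in (64, 256, 1024):
        print(frames, "frames:", round(buffer_latency_ms(frames, 48000), 2), "ms")
    print("drift after 60 s:", round(drift_samples(48000, 100, 60)), "samples")
    print("256 frames of drift after:",
          round(seconds_to_drift(256, 48000, 100), 1), "s")
```

At 48 kHz a 256-frame buffer costs about 5.3 ms and a 1024-frame buffer about 21 ms, each on top of whatever the USB round trip adds. With the assumed 100 ppm mismatch, two cards slip a whole 256-frame buffer apart in under a minute, which is why unsynced inputs need resampling.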

  19. 4 cheap soundcards will sound like 4 cheap soundcards or interfaces. Also, latency might be an issue. And with a card or interface that cheap, your input conversion will very likely not be too good either. I’ve been engineering for over 2 years, and in theory it might sound like a good idea, and with a number of better interfaces it might be a fun project, but in no way would I use 10 dollar cards and expect good results or anything but extreme latency.

  20. My personal suggestion is to look at a Zoom recorder. If you’re a musician look at the LiveTrak L-8, as it will give you a live mixer board with a multitrack digital recording system as a free add-on. It’s the modern analog of the 4-track cassette recorders Tascam and others sold.

    For larger ensembles there are plenty of options. The H4n is also very good and is actually what I’ve used the most. Great for recording a jam session or performance in stereo. I’ve become a big proponent of taking it as it comes, which makes the H1n ideal: it will fit in a pocket with a mini-tripod, and it’s my go-to now for casual use.

    There are lots of other choices. This is just what I chose.

    1. The Zoom H4 is a great product.
      I only wish that you could use the two channels occupied by the built-in XY mic as analog inputs. That would make it one of the cooler devices of all time.

    1. As a little expansion on my comment:
      I not too long ago heard that a percentage of people can’t use VR because their brain simply doesn’t get depth from the parallax of their eyes, even though their eyes are fine.
      Since hearing that I’ve been wondering if there is a percentage of people with the same issue regarding stereo sound. And if perhaps that is why there are quite a few people who are fine with mono sound. I wonder if anybody ever bothered to investigate that.

      I would think that since this is so far unknown that if it exists that would also explain why there are so many pretty expensive mono speakers released, perhaps the people running those companies simply can’t understand why not.

      1. Well if you really want to mix many input sources mono is likely what you actually want on input anyway – any 3d separation you may choose to do would want to be deliberate in the final mix I’d think. So for this the only gain stereo inputs would get you is extra effective channels (assuming you can set up your sound devices into your software of choice correctly).

  21. I have a solution I’ve used in production, by making use of PipeWire, and GStreamer.

    I’ve even got effects, and you control the whole thing via the web, plus automatic feedback detection shutoff to compensate for the lack of physical controls, and resilience against disconnected and then reconnected soundcards.

    There’s also recording, plus soundboard support with the included Chandler module, and you can visually program sounds to play in random patterns for background soundscapes.

    Latency seems to heavily depend on hardware though, I never touch the audio signal directly, so I can only do as well as PipeWire+GStreamer itself, I’m pretty much just doing UI and state management.

    https://github.com/EternityForest/KaithemAutomation

  22. @hearmoretunes I hear Google now released their own ChatGPT type of thing, which they say is better.
    I think the type of questions like you just asked, and which we all sometimes ask, might actually lead to results on those platforms.. maybe.. possibly..

    1. Addendum: Oh, that release of Google’s is only to a test audience. It’s called ‘Bard’ it seems.

      Anyway soon Google will have their own AJT..

      (AJT of course the acronym for Artificial Jive Turkey.)

  23. Seems to me that OBS allows for multiple sound sources via USB, and allows you to put VST plug-ins on them… standard OBS plugin “filters” include EQ and limiting etc. OBS will record mono or stereo… and even builds a “mixer” for you as you add audio inputs.

    Since OBS is the joining point… sync from the various sources should not be an issue… real-time monitoring would be the issue. And I have no idea if OBS can record separate tracks separately… but for programmers OBS is open source… and maybe could be reworked to record multiple tracks as separate tracks.

  24. 1. Any old analog desk or mixer with channel outs.
    2. Your array of USB sound card dongles, plugged into as many different USB ports as your computer might offer.
    2b. Plug the channel outs into the various USB sound cards.
    3. Download Reaper as your free DAW. In Reaper you choose sources for each channel before recording. Once you figure that all out, save a recording template so you only do this once.
    4. You’d have live latency-free monitoring via the mixer’s headphone jack… or main outs.

    Subsequent recorded tracks might be synced using Reaper’s latency adjustment.

    Seems to me you’d have a very cheap multichannel digital recorder, and you’d do your final mixing and sweetening etc. in Reaper itself.

  25. My two cents: even if it worked on the software side you would probably be fighting with audio quality due to the congested USB bus, especially if you connect all the cards on the same hub like you show in the photo. I do not know the details of the USB bus, but my past experience is that USB peripherals that stream significant amounts of data generally do not like each other. Also USB hubs have a tendency to reduce the available throughput.
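A rough way to sanity-check the bandwidth worry is to add up the raw audio payload. The sketch below assumes each card streams 16-bit audio at 48 kHz and that all cards share one full-speed (12 Mbit/s nominal) USB budget. Both are assumptions for illustration; whether the cards really share a single budget depends on the hub, and protocol overhead is ignored.

```python
FULL_SPEED_USB_BITS_PER_S = 12_000_000  # nominal full-speed USB signalling rate


def stream_bits_per_s(sample_rate, bit_depth, channels):
    """Raw PCM payload of one audio stream, ignoring USB protocol overhead."""
    return sample_rate * bit_depth * channels


def bus_share(num_cards, sample_rate=48_000, bit_depth=16, channels=2):
    """Fraction of the nominal full-speed budget used by `num_cards` streams."""
    per_card = stream_bits_per_s(sample_rate, bit_depth, channels)
    return num_cards * per_card / FULL_SPEED_USB_BITS_PER_S


if __name__ == "__main__":
    for cards in (1, 4, 8):
        print(cards, "cards:", round(100 * bus_share(cards), 1), "% of 12 Mbit/s")
```

One stereo 16-bit 48 kHz stream is 1.536 Mbit/s of payload, so four of them already take about half of the nominal figure before any overhead, and eight would not fit. Under these assumptions the congestion concern looks plausible for larger setups on a single shared full-speed link.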

  26. Actually PulseAudio/PipeWire is extremely powerful. But only if you study its config files and modules.
    My own setup combines multiple sound cards into a surround sound system with latency adjustment for each individual channel.
    And all this also works over the network, so any other Linux PC can stream audio to the system with the speaker setup.

    It’s quite remarkable what’s possible and while I can’t tell you the solution for your problem, I’m sure it exists.

    1. Could you share more detailed info on your setup? Any guides to follow? I was looking into doing this with a mini PC and some USB dongles I have around, for a surround sound system or a software-based DSP crossover for stereo audio. Sounds like your setup would be an awesome help.

  27. Latency will be a huge problem with a setup like this. For a couple hundred you can get a complete multitrack recorder, one that can even use batteries if you want to be more mobile. You’d still find a lot of utility with the computer for engineering and mastering the tracks as the recorders don’t have the same features as a DAW on a computer. That said, the computer would have trouble recording 8 simultaneous tracks without latency issues between sound interfaces.

  28. So my question is what does the hardware support? We peaked in the 98se-Win7 era with Drivers that would actually expose all the functionality.

    What I mean is many sound cards (i.e. onboard Realtek codecs) have 7 or so input/output channels and you can mix them. I used to do it. Nothing more irritating than a computer with a very noisy output when you know it’s because the unused CD Audio input is floating and unmuted. Yet the drivers don’t even show it as existing, let alone unmuted into the stereo mix.

    Realtek used to let you download their reference drivers and get around that sort of thing.

    Interesting problem you have, can’t say I’ve run into it.

    1. That sounds like a very convincing argument to not use Windows then, no wonder musical folks tend to pay the Apple tax… Though as far as I know every channel of every chipset is available for control with ALSA on Linux (assuming the device is properly supported and identified all the channels will even be labelled correctly with what they are wired up to in this specific case, but I’ve got a laptop where the mixers are all a bit muddled – never bothered to even look at a fix it as I know)

  29. There are lots of digital rack mixers for less than $300, which was near the cost of a new TASCAM PortaStudio back in the day. A bunch of cheap sound cards sounds like a way to have cheap sounding audio–bad SNR, low headroom (USB bus power is 5V), etc. Just spring for a Behringer or other rack mixer.

  30. Depending on your CPU and RAM capacity, you could try something like that on Windows using a bit of temperamental software, significant patience and tinkering, and if you decide to use a Desktop PC and not a significantly performance limited Laptop, you may end up with far better results. I’ve used this setup myself to broadcast music in VR.
    I highly recommend ASUS PCIe sound cards over anything USB as the USB controller drivers all over the place are pretty latency heavy after the first mix path. I’ve tried Creative cards with disastrous results in either not booting up or constantly BSOD looping. My favorite remark on an Audigy card was “we couldn’t find enough resources to start this device.” Adding in discrete sound cards directly on the PCIe lanes will definitely increase responsiveness and reduce latency. But the software has to be up to snuff, and nothing out there that I’ve tried is. The closest you can get is VBAudio’s Voicemeeter Potato, this developer in the past claimed the software to be Windows 10 compatible, but you can only get it to work in compatibility mode for Windows 8, (he’s even declared that to be why it’s Win 10 compatible) but the only one that’s 64bit is Potato. With Voicemeeter Potato you can mix a handful of physical devices, a few virtual devices, all into several different output physical/virtual devices. The nightmare comes with configuring and setup through days of trial and error.
    And with no real “how to” or what does what exactly, you have to rely on basic computer tech knowledge to understand nearly anything. Almost all the virtual audio apps I’ve tried say “Don’t mess about in the settings, unless you know what you’re doing.” I’d honestly like to meet the poor soul who “knows what they’re doing” but isn’t the developer of the software.

    Oh wait, that’s technically me now…
    If you’re interested in trying it out, sir, please feel free to add me on Discord. Just copy and paste my name: “神 Nafryti#6969” minus the quotes, or you can join my Discord Server via the link. I don’t have the greatest memory, apologies in advance, please don’t be afraid to talk to me and refresh my memory.

  31. A few colleagues and I are trying to figure out how to do the opposite, so how to have a lot of separate outputs (say 10 or 20), on the cheap. Most equipment has limited outputs, no more than say 8 stereo outputs from one source.

    There are soundcards and professional mixers that will allow more multi-channel outputs, but those require a serious budget, which most art students (who are the people we work with) do not have.
    Like you, I find it interesting that this problem can only be solved using expensive equipment — or not at all.

    Fortunately for us, someone has tried using a setup similar to the one you are trying, with a Raspberry Pi and those cheap USB soundcards:
    https://www.esologic.com/multi-audio/

    They are using Python and the sounddevice library to send different audio channels to different soundcards. I have yet to try it out myself, but maybe there is something in that project that may help you to have a lot of inputs?
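For a feel of what routing tracks to several cards with `sounddevice` could look like, here is a rough sketch. It is not the code from the linked project. The helper names are invented, the `"USB"` name fragment is a guess at how the dongles show up, and the playback function is untested here since it needs the third-party `sounddevice` package, NumPy float32 arrays and real hardware.

```python
import threading


def find_output_devices(devices, name_fragment):
    """Indices of devices whose name contains name_fragment and can play audio.

    `devices` is a list of dicts shaped like sounddevice.query_devices()
    entries, i.e. with 'name' and 'max_output_channels' keys.
    """
    wanted = name_fragment.lower()
    return [
        index
        for index, dev in enumerate(devices)
        if wanted in dev["name"].lower() and dev["max_output_channels"] > 0
    ]


def assign_tracks(track_names, device_indices):
    """Pair each track with one output device, in order."""
    if len(track_names) > len(device_indices):
        raise ValueError("more tracks than sound cards")
    return dict(zip(track_names, device_indices))


def play_on_cards(tracks, name_fragment="USB", samplerate=44100):
    """Play each track on its own card, one thread per card.

    tracks: dict of name -> float32 NumPy array shaped (frames, channels).
    The cards are not clock-synced, so the outputs will slowly drift apart.
    """
    import sounddevice as sd  # third-party; only needed for real playback

    devices = list(sd.query_devices())
    routing = assign_tracks(sorted(tracks), find_output_devices(devices, name_fragment))

    def worker(index, data):
        # The stream starts on entering the block and closes on leaving it.
        with sd.OutputStream(device=index, samplerate=samplerate,
                             channels=data.shape[1], dtype="float32") as stream:
            stream.write(data)  # blocks until this track has been queued

    threads = [threading.Thread(target=worker, args=(index, tracks[name]))
               for name, index in routing.items()]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Swapping the output streams for input streams would be the obvious starting point for the many-inputs case, with the same caveat about drift between cards.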

  32. The way:

    Pipewire

    Wireplumber

    Carla (Kx)

    A Daw (Reaper)

    With that setup anyone can have limitless software wiring options.

    Latency and bus congestion are, of course, unbreakable hardware bounds.

    Oh, check out jack_iodelay to measure latency across your wirings.

    PipeWire stands for sound as Xorg stands and stood for video
