The Four Thousand Dollar MP3 Player

[Pat]’s friend got a Pono for Christmas, a digital audio player that prides itself on having the highest fidelity of any music player. It’s a digital audio device designed hand in hand with [Neil Young], a device that had a six million dollar Kickstarter, and is probably the highest-spec audio device that will be released for the foreseeable future.

The Pono is an interesting device. Where CDs have 16-bit, 44.1 kHz audio, the Pono can play modern lossless formats – up to 24-bit, 192 kHz audio. There will undoubtedly be audiophiles arguing over the merits of higher sampling rates and more bits, but there is one way to make all those arguments moot: building an MP3 player out of an oscilloscope.
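To put those two formats side by side, here is a minimal sketch of the raw (uncompressed) PCM data rates. It assumes two-channel stereo for both formats; the function name `pcm_bit_rate` is just an illustration.

```python
def pcm_bit_rate(bits_per_sample, sample_rate_hz, channels=2):
    """Raw (uncompressed) PCM bit rate in bits per second.

    channels=2 is an assumption (stereo) for this comparison.
    """
    return bits_per_sample * sample_rate_hz * channels


cd = pcm_bit_rate(16, 44_100)       # CD audio: 16-bit, 44.1 kHz
hi_res = pcm_bit_rate(24, 192_000)  # the top format quoted for the Pono

print(f"CD:     {cd / 1e6:.3f} Mbit/s")
print(f"Hi-res: {hi_res / 1e6:.3f} Mbit/s")
print(f"Ratio:  {hi_res / cd:.2f}x")
```

Under these assumptions the high-resolution stream carries roughly six and a half times the raw data of a CD, before any lossless compression is applied.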

Digital audio players are limited by the consumer market; there’s no economical way to put gigasamples per second into a device that will ultimately sell for a few thousand dollars. Oscilloscopes are not built for the consumer market, though, and the ADCs and DACs in a medium-range scope will always be above what a simple audio player can manage.

[Pat] figured the Tektronix MDO3000 series scope sitting on his bench would be a great way to capture and play music at extremely high bit rates. He recorded a song to memory at a ‘lazy’ 1 Megasample per second through analog channel one. From there, a press of the button made this sample ready for playback (into a cheap, battery-powered speaker, of course).

Of course this entire experiment means nothing. The FLAC format can only handle a sampling rate of up to 655 kilosamples per second. While digital audio formats could theoretically record up to 2.5 Gigasamples per second, the question of ‘why’ would inevitably enter into the minds of audio engineers and anyone with an ounce of sense. Short of recording music from the master tapes or another analog source directly into an oscilloscope, there’s no way to obtain music at this high of a bit rate. It’s just a dumb demonstration, but it is the most expensive MP3 player you can buy.

62 thoughts on “The Four Thousand Dollar MP3 Player”

    1. No it is not an mp3 player and you even mention the limitation of the flac format in the body copy. If we are engineers at heart then we should know how important clear and precise language is. Saying “you know what I meant you’re just being a nit picker” doesn’t cut it if we are talking engineering, whether it is code writing or hardware specs, words matter.

      1. This is the latest in a series of what appears to be deliberately controversial writeups. Aimed as click bait… Or to create articles from low value content… Who knows? One thing is sure, there aren’t many engineers on HaD staff.

      2. Language is also meant to be connotative. If you say “Four thousand dollar Mp3 player” then the image in the readers mind is of a completely over-the-top solution to something very commonplace. This is exactly the point of the article.
        If they said “Four thousand dollar ultra high sample rate audio recording”, the meaning is more precise but the connotation is lost. It would make the article sound like it was a high-tech solution to an application that required very accurate recording and playback, which might be technically accurate but isn’t really the point of the article.
        I understand the point about being clear, but this is an article, not a schematic.

        1. I think the industry term is Digital Audio Player, or Digital Media Player. It’s as bad as calling an HP a Dell or a car a truck or a speedboat a sailboat.

          I think it’s dumb for supposedly technology proficient people to misidentify it so terribly. FLAC player would have been slightly more accurate, as the source article seems to suggest that’s the original file.

  1. Well, the time-resolution may be excellent – but the A/D-converter is stuck at 8bit (maybe 11bit), thus waaay below CD specifications.

    And as the audio connection is direct, there is no oversampling that could fill up the steps between the voltage steps. With a comparator used as a 1-bit A/D converter it probably should be possible to build a high-resolution converter out of the scope, but just using the built-in A/D, its poor voltage (volume) resolution cannot be saved by the excellent time-wise resolution.

    1. If I follow your comment correctly, you’re incorrect — that’s exactly what oversampling does. It doesn’t matter if oversampling is using a 1 bit comparator or a 11 bit A/D from a scope, the nature of oversampling is that sample rate can be traded for resolution. If the scope has an 8-bit A/D he’d need a pretty high oversampling rate to get 24 bits though. On the order of an oversampling ratio of 65536 (2^16).
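How large the oversampling ratio has to be depends heavily on the assumptions. The sketch below uses the textbook model of plain oversampling followed by averaging, with white, uncorrelated quantization noise and no noise shaping; under that model each extra bit of resolution costs a factor of 4 in sample rate. The function names are hypothetical.

```python
import math


def oversampling_ratio(extra_bits):
    """Oversampling ratio needed to gain `extra_bits` of resolution.

    Assumes plain averaging of white quantization noise (no noise
    shaping): SNR improves 3 dB, i.e. half a bit, per doubling.
    """
    return 4 ** extra_bits


def bits_gained(ratio):
    """Extra bits of resolution for a given oversampling ratio (same model)."""
    return 0.5 * math.log2(ratio)


# Going from an 8-bit converter to 24 bits means 16 extra bits.
print(oversampling_ratio(24 - 8))
# What a ratio of 65536 buys under the same model.
print(bits_gained(65536))
```

Noise-shaping converters (sigma-delta designs) do far better than this per doubling, which is why the figure quoted for any real system depends on the architecture and not only on the ratio.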

  2. I guess “MP3” is generic for “digital audio player” to our illustrious story writer. It looks like the writer of the original hack did the same thing too. If I am reading it right, the Tek only recorded a wave form that was already decoded.

  3. Fast scopes rarely have analog resolution better than 8-bit. With some fancy digital processing for oversampling, that A/D stage could have the resolution. I am however not so sure about the input amplifier stage. If it is designed for high bandwidth, it would just pick up more noise. $4k isn’t that expensive in the world of scopes.

  4. Sampling and bitrate are pointless if the DAC has high noise or distortion, those are the true qualities in audio gear, even “CD quality” will sound much better if the DAC is being used properly VS a cheap Chinese “192kHz/24b” audio codec that will let you hear the CPU slacking off (noise) :P

    Also, why is nobody talking about DSD, which has just one bit of resolution but still can deliver the same quality? (given enough bandwidth)
    It seems kinda silly to store sound as PCM when most DACs and ADCs for audio use are of the sigma-delta type, so they actually have to convert from what essentially is DSD to PCM…
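For readers unfamiliar with how one bit can carry an audio signal, here is a minimal sketch of a first-order sigma-delta modulator. It is a toy illustration of the principle only, not how DSD encoders or any particular converter chip are implemented (real ones use higher-order loops), and the names are hypothetical.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator (illustrative only).

    Takes samples in the range [-1.0, 1.0] and returns a stream of
    +1.0 / -1.0 values, i.e. one bit of resolution per output sample.
    """
    integrator = 0.0
    feedback = 0.0  # previous 1-bit output, fed back into the loop
    bits = []
    for x in samples:
        # Accumulate the error between the input and the last output bit.
        integrator += x - feedback
        # 1-bit quantizer: only the sign of the integrator survives.
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits


def average(values):
    """Crude low-pass filter: the mean of the whole stream."""
    return sum(values) / len(values)


# A constant input of 0.25, heavily oversampled.
stream = sigma_delta_1bit([0.25] * 1000)
print(sorted(set(stream)))        # the distinct output levels
print(round(average(stream), 2))  # the density of +1s tracks the input
```

The information lives in the density of the pulses, so recovering the signal only takes a low-pass filter; the price is a much higher sample rate.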

  5. The hack is impressive.
    Don’t understand the point of a 24-bit ~190KHz sound system, it’s ridiculously priced and even an expert won’t be able to tell the difference between that and a lesser specced competitor. I doubt they’d be able to tell the difference between that and CD audio tomorrow.

    1. It has good reason for professional use, especially when recording – you can afford to sacrifice several of the lower bits to noise and still have several of the upper bits free in case you didn’t set the gain right (no clipping).
      As for consumer use – actually prohibitive, as audio amps and speakers are not designed/built to deal with ultrasound, which can cause all kinds of weird (and audible!) harmonics, so you still have to get rid of anything “extra” that the higher samplerate might capture :P

      1. Having a higher resolution doesn’t change the noise characteristic of the system though, you’d lose the same amount of data in a 16-bit system to noise as you would in a 24-bit system.
        Although the messing up with the gain is a pretty good reason, I stand corrected.
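The headroom argument can be put in numbers with the standard formula for the signal-to-quantization-noise ratio of an ideal converter driven by a full-scale sine wave, roughly 6.02 dB per bit plus 1.76 dB. This is a sketch for ideal converters only; real hardware falls short of these figures.

```python
import math


def ideal_snr_db(bits):
    """SNR of an ideal N-bit quantizer with a full-scale sine input.

    Equivalent to the familiar 6.02 * N + 1.76 dB rule of thumb.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


snr_16 = ideal_snr_db(16)
snr_24 = ideal_snr_db(24)

print(f"16-bit: {snr_16:.2f} dB")
print(f"24-bit: {snr_24:.2f} dB")
print(f"Extra room from 8 more bits: {snr_24 - snr_16:.2f} dB")
```

The analog noise floor is the same either way, as the comment above says; the extra 48 dB or so is what lets a recording engineer leave generous headroom without pushing the signal down into quantization noise.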

    2. I AM an expert and I can tell the difference. The Nyquist number used for the CD audio spec was based on the frequencies we can hear as pure sine waves and does not take into account the complex phase differences human hearing can detect which would be well above 80khz if expressed as sine waves. Although I am almost 60 years old and lost my top octave a while ago I can still hear the effect of a low pass filter at 16khz on real world complex audio even though I can’t hear a sine wave at that frequency. The main problem with high quality audio is NOT expensive hardware but a lack of demand for high quality files which take up large amounts of storage space/streaming bandwidth. I have been involved in pro audio since the early 70’s and worked with state of the art analog as well as digital pro sound gear and until the beginning of this century found analog superior to digital until 24 bit high sample rates became the norm for pro audio and I haven’t looked back since but consumer audio has gone backwards as far as fidelity is concerned because people want quantity not quality, hence the evil mp3 ubiquity. I can rant for hours on this subject but I will spare you for now. Regarding the following comment on “ultrasound” harmonics, most pro audio currently manufactured promotes the extended range frequency response as a positive feature in amplifiers and transducers.

        1. Monster cable is junk and I know all about thousand dollar AC power cables, yes there is a lot of “audiophile” hokum out there, did I say I’ve been around awhile? I once had a contract fabricating SPDIF coax cables for $18 apiece using CCTV cable that you would find in the ceiling at Kmart which some audiophile types then resold for $150 wholesale and retailed for $300 claiming it sounded better when the signal passed through it in one direction, but only after burning in for 24 hours with FM white noise. There are many engineering papers available that document extended frequency response if you care to research the subject rather than making derogatory comments on a subject you know just enough about to be dangerous. I continue to maintain that I am an expert in the field and have worked in state of the art facilities with the best equipment available and have taught audio and acoustics classes at the university level.

          1. Maybe you should read some of those engineering papers yourself. Specifically the ones that discuss oversampling, phase error of filtering, and the filters that are actually applied to a standard 44.1kHz CD played from a modern oversampling DAC. I’ll give you a clue, it’s nowhere near 16kHz, it’s nowhere near 44kHz either. You may be right about the effect it has on hearing, but you’re dead wrong on how it’s applied in the real world.

            Phase error is irrelevant on anything other than the cheapest and nastiest DACs cobbled together by Chinese kids who can’t read datasheets.

        1. Because – in theory – that is the minimum necessary to reproduce a 22khz tone which – in theory – is the upper limit of human hearing, this is not a sine wave but contains tons of ultra sonic harmonics which then need to be filtered out. Off the top of my head I can’t tell you why 44.1 rather than some even number, probably had to do with cost of available crystals or something given that the spec was developed by several corporations more interested in price point than fidelity. The other responses to my comments on phase information are off on a completely different tangent than the point I was trying to make which exists beyond the scope of exclusively digital audio concerns. If you are listening to live audio in a control room with an all analog signal path and microphones and speakers capable of fairly flat frequency response you can hear the effects of phase differences between sources in the same room and very small changes in the placement of microphones can dramatically alter the resultant sound. Sound is just changes in air pressure and exists above and below what humans can hear, the math doesn’t go on vacation just because our hearing does and the effects of that math DO show up in our hearing range. Use your head and use your ears!

          1. Theory my arse! 44.1 Khz was chosen as the sampling rate for Redbook Audio for one reason and one reason only. That is because 44,100 samples is exactly how many could be recorded per second onto PAL 3/4″ U-Matic video tape. Three samples per line x however many lines per frame x 25 frames per second = 44,100. Until not too long ago, U-Matic tapes were still the preferred master media for audio CDs. There was another, digital, tape format that had been replacing U-Matic but the industry finally admitted to itself that a CD-R was acceptable as a master, as long as it passes an error test.

            Every supposed expert on digital audio should know that. If there had been a commonly available magnetic tape format that was capable of storing 48,000 samples per second, it would have been used for the simple reason it would make everything with digital audio easier with it all having one rate. 48 Khz was the standard in the fledgling field of digital audio. Since the Redbook Audio CD was to be a consumer product, 44.1 Khz was close enough. Had CD been developed as a professional format, or if there hadn’t already been a tape capable of greater than 40 Khz sample rate storage, a 48 Khz capable tape format would have been developed.
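The video-tape arithmetic can be checked in a few lines. The comment leaves the line count open; the values below (588 usable lines per frame for PAL, 490 for NTSC, at nominal frame rates of 25 and 30 per second) are the commonly cited figures for PCM adaptors and are an assumption here, not something stated in the thread.

```python
def samples_per_second(samples_per_line, lines_per_frame, frames_per_second):
    """Audio samples stored per second on a video-based PCM recorder."""
    return samples_per_line * lines_per_frame * frames_per_second


# Assumed usable line counts; not given in the comment above.
pal = samples_per_second(3, 588, 25)
ntsc = samples_per_second(3, 490, 30)  # nominal 30 frames per second

print(f"PAL:  {pal} samples/s")
print(f"NTSC: {ntsc} samples/s")
```

With those figures both television systems land on the same rate, which is the usual explanation for why 44.1 kHz suited equipment in both markets.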

          2. Thank you for your clarification, I still say I can hear the difference between 24/96 – 24/192 and CD quality audio, I can also hear autotune on mp3 files played on the radio over 72 volt speakers in the supermarket, in fact the effect is exaggerated by all of the distortion in that signal chain. I’m tired of being an expert, I think I’ll go to bed now.

        2. There is a low pass anti-aliasing filter at the output of the DAC. There needs to be some headroom, so don’t expect 22kHz content in the CD recording.

          Some audio DACs require an active filter and some have internal brickwall digital filters. The steepness of the filter determines how much the high frequencies are attenuated.

          1. The reason for anti-aliasing filters is that the higher the frequency the fewer samples you have available to represent the original frequency, regardless of bit depth. Hence with only 2 or 3 samples you get a waveform that has the fundamental frequency of the original sine wave represented by what is fundamentally a square wave which has a whole lot of harmonic content that was not present in the original signal. The higher the sample rate the more accurate the final DAC output no matter what other factors are involved.

          2. If you look at things in the time domain, the anti-aliasing filter is there as an interpolation filter to smooth out the transition between the discrete samples. So instead of a stair step between two samples in time, you have a smooth curve, i.e. without energy in the higher frequencies.

            If you have a very high sampling frequency like in an oversampling DAC, you can get away with a very simple filter. Because the sampling frequency is well into the MHz, you can position the filter corner at a high enough frequency f0, so that you get very flat gain and phase response for the audio band.

          1. Since we were talking about oscilloscopes originally try this – connect a 44.1khz square wave to one input and a variable sine wave to the other channel. Vary the frequency and/or the time base of the sine wave and watch where it intersects the “sample” square wave. Also vary the pulse width of the square wave. This will graphically illustrate what people are talking about when they refer to “clock jitter” and “phase discrepancies” in digital audio. These phenomena are accentuated at higher frequencies and had a lot to do with early critics of digital audio (I was one) claiming digital sounded “harsh” in the high end.

  6. Aside from the resolution issue others have pointed out, most oscilloscopes have a noise floor in the 1 mV range. Tek, Agilent, Rigol and others usually avoid talking much about noise floor, not to mention actually publishing useful specs, because it’s simply one of those places they hope you won’t focus on too much when deciding which scope to buy!

    Noise in the 1mV range is going to be very unsatisfying for anyone expecting high quality audio.
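To see why a 1 mV noise floor is a problem, here is a rough sketch that converts it into a signal-to-noise ratio and an equivalent number of bits. The 1 V RMS signal level is purely an assumed, roughly line-level figure chosen for illustration, and both voltages are treated as RMS values.

```python
import math


def snr_db(signal_vrms, noise_vrms):
    """Signal-to-noise ratio in dB from two RMS voltages."""
    return 20 * math.log10(signal_vrms / noise_vrms)


def effective_bits(snr):
    """Equivalent resolution of an ideal converter with this SNR."""
    return (snr - 1.76) / 6.02


# Assumed levels: 1 V RMS signal (hypothetical), 1 mV RMS noise floor.
snr = snr_db(1.0, 0.001)

print(f"SNR:            {snr:.1f} dB")
print(f"Effective bits: {effective_bits(snr):.1f}")
```

Under these assumptions the result is around 60 dB, or fewer than 10 effective bits, well short of the roughly 98 dB an ideal 16-bit CD system can manage.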

    1. That noise floor often makes me want to go back to our analog workhorses when dealing with small signals-especially when you have students trying to make sense of the gibberish on the digital scope.

  7. “the FLAC format can only handle a sampling rate of up to 655 kilosamples per second.”

    Don’t confuse bit rates with sampling rates. Flac is losslessly compressed audio so the bit rate has only a loose correlation with the sample rate.
    Furthermore there is no such thing as a maximum rate for flac – I couldn’t find one and as such a source citation would be helpful. I’ve seen flacs that run over 8Mbits/s (granted, they were multichannel).

    1. “FLAC supports linear sample rates from 1Hz – 655350Hz in 1Hz increments.” source

      “Don’t confuse bit rates with sampling rates” I didn’t.

      “Furthermore there is no such thing as a maximum rate for flac” Sure there is. Maximum of 8 channels, maximum of 655350 samples/sec, maximum 32 bits per sample. It’s compressed, but all you have to do is figure out a way to make the most un-compressible file.
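Taking the three limits quoted in the comment above at face value, the worst-case raw data rate of a FLAC stream is simple to work out. This sketch only multiplies the numbers together; it assumes a completely incompressible signal and ignores frame and metadata overhead.

```python
# Limits as quoted in the comment above.
MAX_CHANNELS = 8
MAX_SAMPLE_RATE_HZ = 655_350
MAX_BITS_PER_SAMPLE = 32


def max_raw_bit_rate():
    """Upper bound on the audio payload rate in bits per second."""
    return MAX_CHANNELS * MAX_SAMPLE_RATE_HZ * MAX_BITS_PER_SAMPLE


rate = max_raw_bit_rate()
print(f"{rate} bit/s (about {rate / 1e6:.1f} Mbit/s)")
```

That ceiling is far above the 8 Mbit/s files mentioned earlier in the thread, so both observations can be true at once.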

  8. Even when assuming, that the scope frontend does not add any noise in audio band at all, you will never get high quality audio out of the 8 bit ADC.
    Oversampling would increase the resolution, but an 8 bit ADC does not have the linearity (especially INL) of a 16, 20 or 24 bit ADC. So the THD value would be worse.

  9. Sorry, but not the most expensive; a quick search found that the most expensive version of that scope I could find comes to £9,330.00; another quick search found the dCS Vivaldi Upsampler, costing a total of £12,499.00, and I’m pretty sure that there are pieces of boutique low production audiophile equipment that go for even more than that. High-end audiophile kit’s stupidly expensive. If someone wrote an alternative firmware for one of those scopes, and cased it up a bit nicer, hiding anything unnecessary, they might be able to turn a profit, although they’d need to add some random valves and fancy speaker cable to get the full way there.

  10. I am not an expert, or an audiophile. I can’t even hear that well as I have a hearing aid in both ears. But this post, and the OP sounds to me (see what I did there?) like a tongue-in-cheek play on the entire digital audio sample rate argument. Sometimes you just have to listen between the waveforms.

  11. You people lack the imagination to see the real value of this: pranks. Where I last worked, we had an old crappy digital storage scope. It didn’t have much memory, the bit resolution was poor, the noise floor was high. But none of that matters if you’re just recording a profanity-laced stream of insults. Bonus points: The scope I used had an IEEE-488 interface as did my computer, and the scope allowed uploading, downloading, and triggering playback of sample memory.
