Upgrading A Voice Recorder With A Hex Editor

[Alex] just bought a really nice TEAC VR-20 audio recorder, a very capable recorder perfect for recording your thoughts or just making concert bootlegs. This model was recently replaced by the Tascam DR-08 audio recorder. It’s essentially the same thing, but the Tascam unit can record at 96kHz, whereas the TEAC can only record at 48kHz. [Alex] figured out a way to upgrade his less capable but cheaper VR-20 to record at a higher sample rate with just a simple firmware hack.

The mod began by downloading the firmware for both the TEAC VR-20 and the Tascam DR-08. Both of these sets of firmware were exactly the same size, and after downloading a hex editor, [Alex] found a huge difference in the first 20 bytes of the firmware – the portion that tells the microcontrollers what it actually is.

The solution to improving the bitrate for the TEAC VR-20 was as simple as copying the first 20 bytes from the TEAC firmware over to the first 20 bytes of the Tascam firmware. After that, it’s a simple matter of upgrading his TEAC and getting the ability to record at 96kHz.
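
The byte-copy step can be sketched in a few lines of Python. This is only an illustration, assuming the identifying header really is just the first 20 bytes and that both images are the same length, as described above. The function names are made up for this example, and nothing here accounts for a checksum the real updater might verify.

```python
HEADER_LEN = 20  # per the article: the first 20 bytes identify the device


def transplant_header(donor: bytes, target: bytes, length: int = HEADER_LEN) -> bytes:
    """Return `target` with its first `length` bytes replaced by those of `donor`."""
    if len(donor) != len(target):
        raise ValueError("firmware images differ in size; refusing to patch")
    if len(donor) < length:
        raise ValueError("image is shorter than the header length")
    return donor[:length] + target[length:]


def first_difference_after_header(a: bytes, b: bytes, length: int = HEADER_LEN):
    """Offset of the first differing byte past the header, or None if the bodies match."""
    for offset in range(length, min(len(a), len(b))):
        if a[offset] != b[offset]:
            return offset
    return None
```

With both images read into memory (for example via `pathlib.Path.read_bytes`), `transplant_header(teac_image, tascam_image)` would produce the Tascam firmware carrying the TEAC header, and `first_difference_after_header` gives a quick way to check whether the two original images differ anywhere beyond those 20 bytes.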

A very, very simple hack that’s really just flipping a few bits. Not bad for a two-fold improvement in the recording capability of a handheld audio recorder.

62 thoughts on “Upgrading A Voice Recorder With A Hex Editor”

      1. The only thing the sample rate does is determine the bandwidth in the digital domain. It doesn’t do anything else.

        So yes, this hack is about bandwidth, which is bottlenecked elsewhere.

          1. You should refrain from such arrogance as you obviously lack basic understanding of what semicolo wrote. He claims that there is a low pass filter (analog electronics!) before the Analog-Digital-Converter. Such a filter would restrict the signal bandwidth no matter with what sampling frequency you work.

      1. You are trying to record ultrasonics.

        If you record a sound at a normal sampling rate and then digitally pitch it down 2 octaves or slow it down to x0.25 then it will sound remarkably muffled because there won’t be anything to fill the upper part of the spectrum with.

        On the other hand if you record with high bandwidth, with an ultrasonically capable microphone and at high sample rate, you can pitch it down or slow it down and it will still sound natural because the ultrasonics will become sonic and fill in what would otherwise be a void in the audible spectrum.

        This is used by sound effects artists for movies.

        There’s also people out there recording animals that make ultrasonic noises, they tend to use specialized equipment but sometimes this does the trick.
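
The arithmetic behind the pitch-down argument can be written out as a rough sketch. It assumes the recording really contains content all the way up to half the sample rate, which depends on the microphone and input filter, and the function name is invented for this example.

```python
def bandwidth_after_slowdown(sample_rate_hz: float, speed_factor: float) -> float:
    """Highest frequency left after playing a recording back at `speed_factor` speed.

    Assumes the recording holds content right up to the Nyquist frequency
    (sample_rate / 2), which is an idealisation.
    """
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz * speed_factor


for rate in (48_000, 96_000):
    top = bandwidth_after_slowdown(rate, 0.25)  # x0.25 speed = two octaves down
    print(f"{rate} Hz recording slowed to x0.25 reaches {top:.0f} Hz")
```

Under those assumptions a 48 kHz recording slowed to quarter speed has nothing above 6 kHz, while a 96 kHz recording still reaches 12 kHz.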

  1. Nice hack. A comment on the editorial: using a sample rate of 96kHz does not give a two-fold improvement in the audio quality.

    48kHz is sufficient to recover every last ounce of information in the human audible range. Furthermore, some ADCs have reduced signal-to-noise performance at higher sample rates, which may be the case here. And finally, you cut your recording time in half by doubling the sample rate.
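
The recording-time point follows directly from the data rate of uncompressed audio. A small sketch, assuming 16-bit stereo PCM and a hypothetical 2 GB card (neither figure comes from the article):

```python
def recording_hours(capacity_bytes: int, sample_rate_hz: int,
                    bits_per_sample: int = 16, channels: int = 2) -> float:
    """Hours of uncompressed PCM audio that fit in `capacity_bytes`."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return capacity_bytes / bytes_per_second / 3600


CARD_BYTES = 2 * 10**9  # hypothetical 2 GB card
for rate in (48_000, 96_000):
    print(f"{rate} Hz: {recording_hours(CARD_BYTES, rate):.2f} hours")
```

Whatever the card size, doubling the sample rate doubles the bytes written per second, so the available time is halved.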

    1. Yes and no.

      48 kHz is theoretically enough, but there’s still distortion happening for frequencies just under the Nyquist frequency (24 kHz) because they do not necessarily have the same phase as the sampling clock.

      It creates a beating error in the signal that doesn’t disappear completely until around 1/4 of the sampling rate. Technically, the beating is there for all frequencies but it just gets weaker in intensity the further away from the Nyquist limit you go.

      1. That’s also the reason why a CD’s practical bandwidth ends at around 16 kHz instead of 22 kHz. You can record higher frequencies, but they’ll be easily distorted.

        It’s also why MP3 cuts off frequencies higher than 16 kHz. Not many CDs have anything important above that.

      2. Most of the time it does not matter that the high frequencies are gone (as long as you don’t create aliasing).

        FWIW, the human hearing degrades significantly with age. At 35 you probably have a cutoff at 17..18kHz. At 50 years you’re at about 15..16kHz. It does not get better after that…

        Also, many young people already have damaged hearing because of high listening levels through earbuds. They cannot “hear” the fine differences anymore without correcting for the damage.

        It is funny, though, that the older people get, the more expensive the gadgets become because they have more money. However, they cannot hear the difference anymore and it is like a gold-plated digital monster-cable and the big car outside.

      3. “they do not necessarily have the same phase as the sampling clock”

        Sorry but that’s crap. I hate being this blunt but you should go take Sampling Theorem 101.

        The phase of the signal doesn’t matter, it will be accurately represented as long as its frequency is below the Nyquist limit. The time doesn’t matter either, differences in time smaller than a sample are also accurately represented.

      4. “The phase of the signal doesn’t matter, it will be accurately represented as long as its frequency is below the Nyquist limit.”

        Nope. Sorry.

        Consider a sine signal at exactly the Nyquist frequency. You should be able to record it IF it is at the same phase as the sampling clock. If it is shifted 90 degrees off, the ADC always reads the signal at half-way up or down the sine slope. Suppose our ADC reads 202020202 in the first case. It would read 1111111111 in the second case.

        Now, if you have a frequency that is slightly lower than the Nyquist limit, it does basically the same thing, but it can be seen as drifting in and out of phase with the clock signal which creates a beat frequency, and that distorts the sound at the upper range of your recording.

        What you’re confusing is the accurate representation of the signal and the conservation of information in the signal. The information is conserved below the Nyquist limit, but it is distorted due to interaction with the sampling clock frequency.

        And that is basic sampling theory.
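
The raw sample values described in this comment can be reproduced numerically. The sketch below assumes an ideal sampler at 48 kHz and a unit-amplitude sine. It shows only the stored samples, not what a reconstruction filter would output, which is the point the replies that follow take up.

```python
import math


def sample_sine(freq_hz: float, sample_rate_hz: float, phase_rad: float, count: int):
    """Sample sin(2*pi*f*t + phase) at t = n / sample_rate for n = 0..count-1."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase_rad)
            for n in range(count)]


FS = 48_000.0
NYQUIST = FS / 2

# Exactly at Nyquist the samples are (-1)^n * sin(phase), so the phase decides everything.
peaks = sample_sine(NYQUIST, FS, math.pi / 2, 8)  # sampled on the crests
zeros = sample_sine(NYQUIST, FS, 0.0, 8)          # sampled on the zero crossings
print([round(v, 3) for v in peaks])
print([round(v, 3) for v in zeros])

# 100 Hz below Nyquist: the magnitude of the raw samples swells and fades slowly.
near = sample_sine(NYQUIST - 100.0, FS, 0.0, 480)
envelope = [max(abs(v) for v in near[i:i + 2]) for i in range(0, 480, 2)]
print(round(min(envelope), 3), round(max(envelope), 3))
```

The slow swelling in the last case is the beat pattern visible in the stored numbers. Whether it survives playback depends on the reconstruction stage.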

      5. @Dax: You’re not fundamentally wrong, but your understanding of the sampling theorem is incomplete.

        First of all a signal at the Nyquist frequency is invalid because it violates the Nyquist limit which says any valid signal must be below it, not at, so we’ll ignore that case.

        For a signal slightly below the Nyquist limit you’re looking at the wrong numbers. The stored sample values are not the same as the continuous wave that will come out of the DAC output lowpass which is also known as “reconstruction filter”. For example if you synthesize a sine wave of, say, 15kHz and then look at the sample values in a text editor you’ll see they look nothing like a sine and are largely asymmetric. If, however, you play that same sound through an actual DAC, with a reconstruction filter, and hook the output to an oscilloscope you’ll see that the output is, in fact, a perfectly symmetric sine wave.

        That’s the magic of the reconstruction filter. A perfect reconstruction filter will take those pulsing values and give you a perfectly smooth output. It’s counterintuitive, but it works and has been proved in theory and in practice. Speaking of practice there is a catch: Real reconstruction filters aren’t mathematically perfect, and this means they do have a transition band which presents this beating you mention. However this band is small, most chip manufacturers make it 2kHz-wide so the effective response of the chip *is* from ~5Hz to 20kHz. I think that’s pretty reasonable. Far from the 16kHz you mention. And nobody is -6dB at 20k at 35, there’s way more loss even if all you do is stand still (natural stiffening of the cochlea)…
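
The reconstruction argument can be checked numerically with Whittaker-Shannon (sinc) interpolation, which is the textbook ideal reconstruction filter. This is a sketch under stated assumptions: a 15 kHz sine sampled at 44.1 kHz, a finite run of 2001 samples standing in for an infinite one, and evaluation points near the middle of the run so that the truncation has little effect. A real DAC filter is not ideal and behaves only approximately like this.

```python
import math

FS = 44_100.0    # CD sample rate
TONE = 15_000.0  # the 15 kHz example from the comment
COUNT = 2001     # number of stored samples (arbitrary, for illustration)

samples = [math.sin(2 * math.pi * TONE * n / FS) for n in range(COUNT)]


def sinc(x: float) -> float:
    """Normalised sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(stored, position: float) -> float:
    """Whittaker-Shannon interpolation at a fractional sample index `position`."""
    return sum(value * sinc(position - n) for n, value in enumerate(stored))


# Evaluate between the stored samples near the middle, far from the truncated edges.
worst_error = 0.0
for step in range(40):
    position = 1000 + step / 8  # eighth-of-a-sample steps
    ideal = math.sin(2 * math.pi * TONE * position / FS)
    worst_error = max(worst_error, abs(reconstruct(samples, position) - ideal))

print("first stored samples:", [round(v, 2) for v in samples[:8]])
print("worst reconstruction error:", worst_error)
```

The stored values jump around unevenly, yet the interpolated curve follows the original sine closely. The small residual error comes from cutting the infinite sum off at 2001 samples.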

      6. @Dax: Took a look at the NASA paper just in case, and it is in full agreement with what I said.

        I quote page 3:

        Rather, the distortions occur because most cost-effective signal waveform playback and display technologies leave out the complete waveform reconstruction required in accordance with the Sampling Integral.

      7. I quote again your own NASA paper, again page 3:

        Shannon’s theorem can be paraphrased as “So long as the signal is band-limited to less than the Nyquist frequency, it can be completely reconstructed without any distortion.”

  2. Higher sampling rates can offer certain advantages, for example cutting down significantly on distortion during digital editing processes such as filtering/EQ and time manipulation.

    But I am also curious whether this hack actually changes the sampling rate of the ADC or if it just effectively doubles the samples in the recording because the MCU thinks it is sampling at a higher rate. And if the actual sampling rate does really increase, there could still be hardware limitations preventing the device from truly recording at 96 kHz (such as the LPF mentioned by semicolo).

    1. “cutting down significantly on distortion during digital editing processes”

      Except it doesn’t. There’s only one edit that really benefits from higher bandwidth recording: Pitch-shifting – and only when you pitch down, that is. Sound designers (as in sound effects for TV and movies) do this often so they do genuinely have a reason to use high bandwidth, but that’s about it.

      “But I am also curious whether this hack actually changes the sampling rate of the ADC”

      It does change the sampling rate of the ADC, otherwise you’d get a mess in the recording because surely the microcontroller doesn’t have its own resampler.

      The hardware limitations may or may not be in the ADC itself, there are some chips that have the antialiasing filter fixed at ~20kHz regardless of sampling rate, and there are others that move it up to ~40kHz when recording at 96kHz. It depends on the specific part. However even if the ADC is really recording 96kHz, it is very unlikely the microphone capsule itself will have a decent response up there, if it’s directional then the directionality mechanism is sure to seriously mess with the ultrasonic response, and if it’s omnidirectional then the body of the recorder probably gets in the way. Microphones with a genuine ultrasonic response have specific body shapes for this reason.

  3. Both of these sets of firmware were exactly the same size, and after downloading a hex editor, [Alex] found a huge difference in the first 20 bytes of the firmware – the portion that tells the microcontrollers what it actually is.

    The solution to improving the bitrate for the TEAC VR-20 was as simple as copying the first 20 bytes from the TEAC firmware over to the first 20 bytes of the Tascam firmware.

    If the two binaries really are exactly the same except for the first 20 bytes, why copy those 20 bytes from the TEAC to the Tascam? Why not just use the Tascam firmware directly?

    1. Just a guess here (because I don’t own either and don’t want to download the two firmware images), but the two binaries are probably different (but compatible, and fairly similar). The first 20 bytes have to be copied over so that the device sees the correct header data – otherwise it would refuse to install.

    2. because

      Both of these sets of firmware were exactly the same size, and after downloading a hex editor, [Alex] found a huge difference in the first 20 bytes of the firmware – the portion that tells the microcontrollers what it actually is.

    3. I was wondering the exact same thing, I bet if you were to take both units apart they have the exact same components. Many companies have resorted to this type of tactic to make more money. But it’s still a cool hack, would be more interesting to find out what capabilities the chips have and see if you can enable / upgrade other features.

      1. I knew of a major Hard Drive company that did the same thing. It was cheaper to build the hard drives all the same and then with firmware, or sometimes a cut trace, change the drive specs. That way the customer had a choice, and most bought the middle of the road version.

  4. nice hack however unless you are intending on burning to sacd http://en.wikipedia.org/wiki/Sacd

    or the recorder can record more than 1 or 2 channels (most recorders for voice are mono) the higher sampling rate is overkill.

    the reason i suspect in today’s cds that the quality is going down is because the artists think piracy is causing them to not want to make quality or they cant afford quality.

    most sound editors will up/downsample to conform to format

  5. In respect to audio the only real advantage to oversampling (which using 96kHz for audio is), and it is admittedly a significant one, is to reduce the requirements on the reconstruction filter. This in turn reduces the artefacts/distortion etc associated with the physical problems of creating a near brickwall filter, especially ringing.

  6. In my personal experience, editing in a higher sample rate environment is definitely helpful. It’s really noticeable in effects that add harmonics such as distortion. The higher bandwidth helps prevent the added high-frequency harmonics from aliasing and showing up as low-frequency digital artifacts. Even if my source audio is 44.1K I upsample it to 96K, edit it, and then downsample to 44.1K for the CD. I don’t know how well this recorder captures the ultrasonic, but hey, cool hack regardless.
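
The aliasing described in this comment can be illustrated with a little frequency-folding arithmetic. The sketch assumes an effect that adds integer harmonics to a 10 kHz tone (a hypothetical test signal) and that no oversampling or filtering happens inside the effect itself.

```python
def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a tone appears after sampling, folded into 0..Nyquist."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


FUNDAMENTAL = 10_000.0  # hypothetical test tone
for rate in (44_100.0, 96_000.0):
    for harmonic in (2, 3, 4):
        wanted = harmonic * FUNDAMENTAL
        heard = alias_frequency(wanted, rate)
        note = "ok" if heard == wanted else "aliased"
        print(f"{rate:.0f} Hz: harmonic {harmonic} at {wanted:.0f} Hz -> {heard:.0f} Hz ({note})")
```

At 44.1 kHz the third and fourth harmonics fold back to 14.1 kHz and 4.1 kHz, squarely in the audible range. At 96 kHz all three stay where they belong and can be filtered out before downsampling.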

  7. I suspect that some of our fellow commenters might have been thrown off by the description of recording at 48kHz and increasing to 96kHz as an increase in the frequency range the device can record. When discussing sample rates I prefer using Ks (Kilo Samples) or Gs (Giga Samples) so as not to mix up frequency range and sample rate.
    It is interesting that they both use the same software except for the first 20 bytes. I do wonder though whether the ADC can convert fast enough to record at 96,000 samples per second, otherwise the controller will just be waiting on the ADC to finish a conversion, resulting in a lower record sample rate.

  8. Despicable.
    “Hey, let’s build a device capable of something, but then not enable it in the software! That way we can design a new casing and later release a new version of it and wring out more profit from the same design at the expense of the consumer!”

    This is why it should be illegal to release closed source software.

    1. Very common in whitegoods & other domestic items.

      A manufacturer will release a range of products, say a washing machine. Sometimes the only difference between the models is the faceplate for the controls. The lower priced model is missing the buttons to select the options the higher priced one ‘has’.

      If you dismantle it you’ll see where the switches are supposed to go, and so on. Many hacks on this site are just that, how to enable the ‘missing’ features.

      Software is the same – Windows 7 being a good example. Everyone gets the same version, but higher priced key simply unlocks more features.

      1. A lot of electronics do that so the retailers never have to do price matching — all the models are company exclusives. WalMart will have TV Model 1001, Sam’s Club will have 1002, Best Buy will have 1003, etc. There are subtle differences, the 1003 may be the ‘deluxe’ model with better performance specs, the 1002 will be a discount model with some lower end performance specs. You can’t ever go to one store and ask them to price match a competitor’s sale, because they are slightly different. Under the hood, they often have the same control board with different jumpers.

      2. Same model with a different sticker to avoid price matching is slightly different

        But yeah, the ‘standard’ and ‘deluxe’ models (regardless of the sticker), are often the same as well except for a few missing buttons, like this bloke found with the firmware.

  9. All this talk about digital recording reminds me of the reason why audio CDs are encoded at the odd rate of 44.1K instead of something more even like 48K.

    It has absolutely nothing to do with Nyquist frequency or quantization or anything else like that.

    44.1K was chosen simply because it was the most data they could get per video frame on the 3/4″ U-Matic video tape recorders that were hacked into the first digital audio recorders.

    Same sort of crazy reason why the original standard maximum time of a CD was 74 minutes, because Sony president Norio Ohga said it “would be able to encompass an entire opera or all of Beethoven’s 9th Symphony”.

    Same reason why the various “books” for optical storage are called Red Book, Yellow Book, Orange Book, etc. Those are the colors of the folders the industry people happened to grab hold of to hold the original copies of the finalized standards. Probably wasn’t any plan to the colors, was most likely whichever color was next in the package of folders.

    The circumstances behind the name of the High Sierra format are a little less mundane. The meeting to design it was held at the High Sierra Hotel and Casino (currently called the Horizon Casino) near Lake Tahoe, California.

    1. Indeed you are correct – and it is a short sighted decision that has cost the world BILLIONS of dollars over the years – the cost of resampling from a non-integer rate is enormous in any unit you wish to measure it.

      1. @anon : unless you consider that removing one sample every two samples is not really resampling, as you do not create any new sample by interpolating. 88.2 to 44.1 removes samples, 48 to 44.1 creates new samples.

      2. Resampling: put zeros between the original samples at a rate of the least common multiple of original and target sample rate [upsampling]. Then use a brick wall lowpass filter (at half target rate) and pick samples at the target rate [downsampling]. Audio rates as 16, 24, 32, 48, 96kHz allow easy (= cheap) conversion due to lower LCM clock specs. 44.1kHz and derivatives increase the clock specs and cost.
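
The upsample-then-downsample recipe can be made concrete by computing the two integer factors and the intermediate rate for a few conversions. This sketch covers only the ratio arithmetic, with a function name invented for the example. The lowpass filter itself is left out, and practical resamplers usually use polyphase structures that never actually run at the intermediate rate.

```python
from math import gcd


def resample_ratio(source_hz: int, target_hz: int):
    """Return (up, down, intermediate_hz) for rational-ratio resampling."""
    common = gcd(source_hz, target_hz)
    up = target_hz // common          # zero-stuffing factor
    down = source_hz // common        # decimation factor
    intermediate_hz = source_hz * up  # least common multiple of the two rates
    return up, down, intermediate_hz


for source, target in ((48_000, 96_000), (88_200, 44_100), (44_100, 48_000)):
    up, down, mid = resample_ratio(source, target)
    print(f"{source} -> {target}: up {up}, down {down}, intermediate {mid} Hz")
```

Going from 48 kHz to 96 kHz needs only a factor of 2, and 88.2 kHz to 44.1 kHz only a decimation by 2, whereas 44.1 kHz to 48 kHz works out to up 160 and down 147 with a nominal intermediate rate of 7.056 MHz.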

  10. I wish Tascam would offer an updated firmware for my voice recorder…or at least a download link for the stock one so I could tinker with it. Stupid thing stops recording after 14.5 hours, but has enough memory to record for a solid week. Really sucks when trying to record an all-weekend music festival.

  11. Now if only I could do this with my Olympus W400-S. Not that I need any more sample rate (better mics before that), but I’d like to be able to record WAV. The stupid thing forces WMA on you. Also being able to turn off the dynamic range compressor would be nice. All I want is less features, Olympus!

    I only got so far as determining it has a TI TMS320 powering the thing – I don’t think I found the firmware anywhere. Now that I’ve picked up the C2000 Launchpad I should have a debugger for that chip! Time to start poking!
