Upgrading a voice recorder with a hex editor

[Alex] just bought a really nice TEAC VR-20 audio recorder, a very capable recorder perfect for recording your thoughts or just making concert bootlegs. This model was recently replaced by the Tascam DR-08 audio recorder. It’s essentially the same thing, but the Tascam unit can record at 96kHz, whereas the TEAC can only record at 48kHz. [Alex] figured out a way to upgrade his less capable but cheaper VR-20 to record at the higher sample rate with just a simple firmware hack.

The mod began by downloading the firmware for both the TEAC VR-20 and the Tascam DR-08. Both of these sets of firmware were exactly the same size, and after downloading a hex editor, [Alex] found a huge difference in the first 20 bytes of the firmware – the portion that tells the microcontrollers what it actually is.
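The comparison a hex editor makes visible can also be scripted. The following is a minimal sketch, not [Alex]'s actual procedure: the function name and the stand-in byte strings are hypothetical, and real firmware images would be read from disk instead.

```python
def differing_offsets(a: bytes, b: bytes) -> list[int]:
    """Return every byte offset at which two equal-length images differ."""
    if len(a) != len(b):
        raise ValueError("images must be the same size to compare offset by offset")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Stand-in data, NOT real firmware: two images with different 20-byte
# headers followed by an identical body.
body = bytes(range(200))
teac_image = bytes(range(0x10, 0x24)) + body
tascam_image = bytes(range(0x40, 0x54)) + body

offsets = differing_offsets(teac_image, tascam_image)
print(f"{len(offsets)} differing bytes, all below offset {max(offsets) + 1}")
```

If every reported offset falls inside the first 20 bytes, the two images share the same body and differ only in the header.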

The solution to improving the bitrate for the TEAC VR-20 was as simple as copying the first 20 bytes from the TEAC firmware over to the first 20 bytes of the Tascam firmware. After that, it’s a simple matter of upgrading his TEAC and getting the ability to record at 96kHz.
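As a rough illustration of that step, the sketch below grafts one image's header onto another image's body. The helper name and the stand-in data are hypothetical. The idea that the recorder's updater checks those 20 bytes before accepting a file is an assumption drawn from the description above, not something verified here, and flashing a modified image can brick a device.

```python
HEADER_LEN = 20  # header size reported in the write-up


def graft_header(donor: bytes, target: bytes, header_len: int = HEADER_LEN) -> bytes:
    """Return `target` with its first `header_len` bytes replaced by `donor`'s."""
    if len(donor) != len(target):
        raise ValueError("expected two images of identical size")
    return donor[:header_len] + target[header_len:]


# Stand-in data, NOT real firmware.
teac_image = b"A" * HEADER_LEN + b"teac body"
tascam_image = b"B" * HEADER_LEN + b"dr08 body"

# TEAC header (so the TEAC unit would recognise the file) on the Tascam body.
patched = graft_header(donor=teac_image, target=tascam_image)
print(patched[:HEADER_LEN], patched[HEADER_LEN:])
```

With real files, the two inputs would be read with `open(path, "rb").read()` and the result written to a new file, leaving both originals untouched.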

A very, very simple hack that’s really just flipping a few bits. Not bad for a two-fold improvement in the recording capability of a handheld audio recorder.

Comments

  1. semicolo says:

    Hmmm, what about the ~24KHz low pass filter that’s supposed to sit before the ADC? Won’t higher frequencies be filtered out rendering this hack useless?

    • termm says:

      96KHz is the sampling rate.
      This hack isn’t about the bandwidth of the recorder.

      • anon says:

        The only thing the sample rate does is determine the bandwidth in the digital domain. It doesn’t do anything else.

        So yes, this hack is about bandwidth, which is bottlenecked elsewhere.

        • draeath says:

          I suggest you Google “nyquist” and start reading, you obviously lack some understanding of what is going on.

          • Hans says:

            You should refrain from such arrogance as you obviously lack basic understanding of what semicolo wrote. He claims that there is a low pass filter (analog electronics!) before the Analog-Digital-Converter. Such a filter would restrict the signal bandwidth no matter what sampling frequency you work with.

    • Alex Rossie says:

      This.

    • John says:

      What do you mean by higher frequencies? Given that the ear can’t hear above about 20k, what are you trying to record above that?

      • anon says:

        You are trying to record ultrasonics.

        If you record a sound at a normal sampling rate and then digitally pitch it down 2 octaves or slow it down to x0.25 then it will sound remarkably muffled because there won’t be anything to fill the upper part of the spectrum with.

        On the other hand if you record with high bandwidth, with an ultrasonically capable microphone and at high sample rate, you can pitch it down or slow it down and it will still sound natural because the ultrasonics will become sonic and fill in what would otherwise be a void in the audible spectrum.

        This is used by sound effects artists for movies.

        There’s also people out there recording animals that make ultrasonic noises, they tend to use specialized equipment but sometimes this does the trick.
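The arithmetic behind the slow-down argument can be sketched in a few lines. This assumes the whole analog chain (microphone, preamp, anti-aliasing filter) really delivers content up to half the sample rate, which other comments here dispute for this recorder. The function name is made up for the example.

```python
def ceiling_after_slowdown(sample_rate_hz: float, slowdown: float) -> float:
    """Highest frequency left in a recording after slowing it down.

    A recording can hold content only up to sample_rate / 2, and slowing
    playback by a factor k divides every frequency by k.
    """
    return (sample_rate_hz / 2) / slowdown


# Two octaves down is a slowdown factor of 4.
for rate in (48_000, 96_000):
    top = ceiling_after_slowdown(rate, 4)
    print(f"{rate} Hz recording slowed x4 has content up to {top:.0f} Hz")
```

Under that assumption the 48 kHz recording tops out at 6 kHz after the slowdown, while the 96 kHz one reaches 12 kHz, which is the difference between muffled and merely dull.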

  2. Mark says:

    Nice hack. A comment on the editorial: using a sample rate of 96kHz does not give a two-fold improvement in the audio quality.

    48kHz is sufficient to recover every last ounce of information in the human audible range. Furthermore, some ADCs have reduced signal-to-noise performance at higher sample rates, which may be the case here. And finally, you cut your recording time in half by doubling the sample rate.
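The recording-time point follows directly from the data rate of uncompressed PCM. Here is a small sketch; the 2 GB card, 16-bit depth and mono channel count are illustrative assumptions, not the VR-20's documented specifications.

```python
def recording_seconds(capacity_bytes: int, sample_rate_hz: int,
                      bits_per_sample: int = 16, channels: int = 1) -> float:
    """Seconds of uncompressed PCM audio that fit in `capacity_bytes`."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return capacity_bytes / bytes_per_second


CARD_BYTES = 2 * 1000**3  # hypothetical 2 GB card

for rate in (48_000, 96_000):
    hours = recording_seconds(CARD_BYTES, rate) / 3600
    print(f"{rate} Hz: about {hours:.2f} hours")
```

Whatever the card size or bit depth, doubling the sample rate doubles the bytes per second and so halves the time available.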

    • Dax says:

      Yes and no.

      48 kHz is theoretically enough, but there’s still distortion happening for frequencies just under the Nyquist frequency (24 kHz) because they do not necessarily have the same phase as the sampling clock.

      It creates a beating error in the signal that doesn’t disappear completely until around 1/4 of the sampling rate. Technically, the beating is there for all frequencies but it just gets weaker in intensity the further away from the Nyquist limit you go.

      • Dax says:

        That’s also the reason why a CD’s practical bandwidth ends at around 16 kHz instead of 22 kHz. You can record higher frequencies, but they’ll be easily distorted.

        It’s also why MP3 cuts off frequencies higher than 16 kHz. Not many CDs have anything important above that.

      • Bertho says:

        Most of the time it does not matter that the high frequencies are gone (as long as you don’t create aliasing).

        FWIW, the human hearing degrades significantly with age. At 35 you probably have a cutoff at 17..18kHz. At 50 years you’re at about 15..16kHz. It does not get better after that…

        Also, many young people already have damaged hearing because of the high intensity from earbuds. They cannot “hear” the fine differences anymore without correcting for the damage.

        It is funny, though, that the older people get, the more expensive the gadgets become because they have more money. However, they cannot hear the difference anymore and it is like a gold-plated digital monster-cable and the big car outside.

      • Dax says:

        Women don’t typically have the same hearing range suppression as men, and it’s not a hard limit. At 35 you’re at something like -6 dB at 20 kHz.

      • anon says:

        “they do not necessarily have the same phase as the sampling clock”

        Sorry but that’s crap. I hate being this blunt but you should go take Sampling Theorem 101.

        The phase of the signal doesn’t matter, it will be accurately represented as long as its frequency is below the Nyquist limit. The time doesn’t matter either, differences in time smaller than a sample are also accurately represented.

      • Dax says:

        “The phase of the signal doesn’t matter, it will be accurately represented as long as its frequency is below the Nyquist limit.”

        Nope. Sorry.

        Consider a sine signal at exactly the Nyquist frequency. You should be able to record it IF it is at the same phase as the sampling clock. If it is shifted 90 degrees off, the ADC always reads the signal at half-way up or down the sine slope. Suppose our ADC reads 202020202 in the first case. It would read 1111111111 in the second case.

        Now, if you have a frequency that is slightly lower than the Nyquist limit, it does basically the same thing, but it can be seen as drifting in and out of phase with the clock signal which creates a beat frequency, and that distorts the sound at the upper range of your recording.

        What you’re confusing is the accurate representation of the signal and the conservation of information in the signal. The information is conserved below the Nyquist limit, but it is distorted due to interaction with the sampling clock frequency.

        And that is basic sampling theory.

      • Dax says:
      • anon says:

        @Dax: You’re not fundamentally wrong, but your understanding of the sampling theorem is incomplete.

        First of all a signal at the Nyquist frequency is invalid because it violates the Nyquist limit which says any valid signal must be below it, not at, so we’ll ignore that case.

        For a signal slightly below the Nyquist limit you’re looking at the wrong numbers. The stored sample values are not the same as the continuous wave that will come out of the DAC output lowpass which is also known as “reconstruction filter”. For example if you synthesize a sine wave of, say, 15kHz and then look at the sample values in a text editor you’ll see they look nothing like a sine and are largely asymmetric. If, however, you play that same sound through an actual DAC, with a reconstruction filter, and hook the output to an oscilloscope you’ll see that the output is, in fact, a perfectly symmetric sine wave.

        That’s the magic of the reconstruction filter. A perfect reconstruction filter will take those pulsing values and give you a perfectly smooth output. It’s counterintuitive, but it works and has been proved in theory and in practice. Speaking of practice there is a catch: Real reconstruction filters aren’t mathematically perfect, and this means they do have a transition band which presents this beating you mention. However this band is small, most chip manufacturers make it 2kHz-wide so the effective response of the chip *is* from ~5Hz to 20kHz. I think that’s pretty reasonable. Far from the 16kHz you mention. And nobody is -6dB at 20k at 35, there’s way more loss even if all you do is stand still (natural stiffening of the cochlea)…
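The disagreement in this thread can be checked numerically. The sketch below samples a 22 kHz sine at 48 kHz and then evaluates the Whittaker-Shannon (sinc) interpolation between two samples. The ideal sinc sum stands in for a real DAC's reconstruction filter, and it is truncated to a finite window, so the result is only approximate. The tone, the window length and the names are choices made for this example.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc, sin(pi x) / (pi x), defined as 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples: list[float], first_index: int, t: float) -> float:
    """Sinc interpolation at time t, with t measured in sample periods."""
    return sum(s * sinc(t - (first_index + k)) for k, s in enumerate(samples))


SAMPLE_RATE = 48_000  # Hz
TONE = 22_000         # Hz, just under the 24 kHz Nyquist frequency
HALF_SPAN = 2_000     # samples kept each side of t = 0 (finite stand-in for an infinite sum)

samples = [math.sin(2 * math.pi * TONE * n / SAMPLE_RATE)
           for n in range(-HALF_SPAN, HALF_SPAN + 1)]

# Largest stored value among samples 0, 1 and 2: the raw numbers look weak here.
raw_peak = max(abs(samples[HALF_SPAN + n]) for n in range(3))

# The underlying sine first peaks a quarter period after t = 0,
# which falls between sample 0 and sample 1.
t_peak = SAMPLE_RATE / (4 * TONE)
rebuilt_peak = reconstruct(samples, -HALF_SPAN, t_peak)

print(f"largest of samples 0..2: {raw_peak:.3f}")
print(f"reconstructed value at t = {t_peak:.3f} sample periods: {rebuilt_peak:.3f}")
```

The stored values near t = 0 reach only about half of full scale, which is the "beating" visible in the raw numbers, yet the interpolated waveform comes out very close to the full amplitude of 1 between them. How closely a real DAC approaches this depends on its filter, as the comment above notes.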

      • anon says:

        @Dax: Took a look at the NASA paper just in case, and it is in full agreement with what I said.

        I quote page 3:

        Rather, the distortions occur because most cost-effective signal waveform playback and display technologies leave out the complete waveform reconstruction required in accordance with the Sampling Integral.

      • Dax says:

        Nope. The Nyquist frequency is a valid frequency. Check any textbook. It’s below or at the limit.

      • anon says:

        I quote again your own NASA paper, again page 3:

        Shannon’s theorem can be paraphrased as “So long as the signal is band-limited to less than the Nyquist frequency, it can be completely reconstructed without any distortion.”

  3. Cricri says:

    Exactly the kind of thing I expect to find here. Great stuff!

  4. nate says:

    Higher sampling rates can offer certain advantages, for example cutting down significantly on distortion during digital editing processes such as filtering/EQ and time manipulation.

    But I am also curious whether this hack actually changes the sampling rate of the ADC or if it just effectively doubles the samples in the recording because the MCU thinks it is sampling at a higher rate. And if the actual sampling rate does really increase, there could still be hardware limitations preventing the device from truly recording at 96 kHz (such as the LPF mentioned by semicolo).

    • anon says:

      “cutting down significantly on distortion during digital editing processes”

      Except it doesn’t. There’s only one edit that really benefits from higher bandwidth recording: Pitch-shifting – and only when you pitch down, that is. Sound designers (as in sound effects for TV and movies) do this often so they do genuinely have a reason to use high bandwidth, but that’s about it.

      “But I am also curious whether this hack actually changes the sampling rate of the ADC”

      It does change the sampling rate of the ADC, otherwise you’d get a mess in the recording because surely the microcontroller doesn’t have its own resampler.

      The hardware limitations may or may not be in the ADC itself, there are some chips that have the antialiasing filter fixed at ~20kHz regardless of sampling rate, and there are others that move it up to ~40kHz when recording at 96kHz. It depends on the specific part. However even if the ADC is really recording 96kHz, it is very unlikely the microphone capsule itself will have a decent response up there, if it’s directional then the directionality mechanism is sure to seriously mess with the ultrasonic response, and if it’s omnidirectional then the body of the recorder probably gets in the way. Microphones with a genuine ultrasonic response have specific body shapes for this reason.

  5. mziwisky says:

    Both of these sets of firmware were exactly the same size, and after downloading a hex editor, [Alex] found a huge difference in the first 20 bytes of the firmware – the portion that tells the microcontrollers what it actually is.

    The solution to improving the bitrate for the TEAC VR-20 was as simple as copying the first 20 bytes from the TEAC firmware over to the first 20 bytes of the Tascam firmware.

    If the two binaries really are exactly the same except for the first 20 bytes, why copy those 20 bytes from the TEAC to the Tascam? Why not just use the Tascam firmware directly?

  6. ejonesss says:

    nice hack however unless you are intending on burning to sacd http://en.wikipedia.org/wiki/Sacd

    or the recorder can record more than 1 or 2 channels (most recorders for voice are mono) the higher sampling rate is overkill.

    the reason i suspect in today’s cds that the quality is going down is because the artists think piracy is causing them to not want to make quality or they cant afford quality.

    most sound editors will up/downsample to conform to format

  7. Tom says:

    40KHz or so is theoretically fine for recording everything humans can hear, but it doesn’t give any head room for resampling or filtering.

  8. garym53 says:

    In respect to audio the only real advantage to oversampling (which using 96kHz for audio is), and it is admittedly a significant one, is to reduce the requirements on the reconstruction filter. This in turn reduces the artefacts/distortion etc associated with the physical problems of creating a near brickwall filter, especially ringing.

  9. sean says:

    In my personal experience, editing in a higher sample rate environment is definitely helpful. It’s really noticeable in effects that add harmonics such as distortion. The higher bandwidth helps prevent the added high-frequency harmonics from aliasing and showing up as low-frequency digital artifacts. Even if my source audio is 44.1K I upsample it to 96K, edit it, and then downsample to 44.1K for the CD. I don’t know how well this recorder captures the ultrasonic, but hey, cool hack regardless.
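The aliasing of added harmonics can be worked out with a little folding arithmetic. This sketch takes a 10 kHz tone, a value picked only for illustration, and shows where its first few harmonics would land at two sample rates if no oversampling or filtering intervened.

```python
def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a tone appears after sampling without an anti-aliasing filter."""
    folded = freq_hz % sample_rate_hz
    # Anything above the Nyquist frequency mirrors back down.
    return sample_rate_hz - folded if folded > sample_rate_hz / 2 else folded


FUNDAMENTAL = 10_000  # Hz, illustrative

for rate in (44_100, 96_000):
    landed = [alias_frequency(FUNDAMENTAL * k, rate) for k in range(2, 6)]
    print(f"{rate} Hz: harmonics 2-5 land at {landed}")
```

At 44.1 kHz the third, fourth and fifth harmonics fold down to 14.1 kHz, 4.1 kHz and 5.9 kHz, squarely in the audible band and unrelated to the original pitch. At 96 kHz the same harmonics stay at or above 30 kHz, where a lowpass can remove them before downsampling.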

  10. George Johnson says:

    Bootleg concert? Yeah, nothing like listening to a concert in mono. Hey! let’s listen to AM radio!

  11. fdsf says:

    “Nice hack”, except that they’re exactly the same cost. Just buy the not-discontinued unit, which probably had other improvements rolled into it.

  12. Wm_Atl says:

    I suspect that some of our fellow commenters might have been thrown off by the description of recording at 48kHz and increasing to 96kHz as an increase in the frequency range the device can record. When discussing sample rates I prefer using Ks (Kilo Samples) or Gs (Giga Samples) so as not to mix up frequency range and sample rate.
    It is interesting that they both use the same software except for the first 20 bytes. I do wonder though whether the ADC can convert fast enough to record at 96,000 samples per second; otherwise the controller will just be waiting on the ADC to finish a conversion, resulting in a lower record sample rate.

  13. Onaka says:

    Despicable.
    “Hey, let’s build a device capable of something, but then not enable it in the software! That way we can design a new casing and later release a new version of it and wring out more profit from the same design at the expense of the consumer!”

    This is why it should be illegal to release closed source software.

    • Tony says:

      Very common in whitegoods & other domestic items.

      A manufacturer will release a range of products, say a washing machine. Sometimes the only difference between the models is the faceplate for the controls. The lower priced model is missing the buttons to select the options the higher priced one ‘has’.

      If you dismantle it you’ll see where the switches are supposed to go, and so on. Many hacks on this site are just that, how to enable the ‘missing’ features.

      Software is the same – Windows 7 being a good example. Everyone gets the same version, but higher priced key simply unlocks more features.

      • Steve0 says:

        A lot of electronics do that so the retailers never have to do price matching — all the models are company exclusives. WalMart will have TV Model 1001, Sam’s Club will have 1002, Best Buy will have 1003, etc. There are subtle differences, the 1003 may be the ‘deluxe’ model with better performance specs, the 1002 will be a discount model with some lower end performance specs. You can’t ever go to one store and ask them to price match a competitor’s sale, because they are slightly different. Under the hood, they often have the same control board with different jumpers.

      • Tony says:

        Same model with a different sticker to avoid price matching is slightly different.

        But yeah, the ‘standard’ and ‘deluxe’ models (regardless of the sticker), are often the same as well except for a few missing buttons, like this bloke found with the firmware.

  14. t&p says:

    errrrr that is some balls right there. I don’t know much about this device but messing with different firmware and force installing could have bricked it.

  15. lamer says:

    what? No Arduino ;)
    a real hack.. Shocked..
    Good stuff ;)

    /Lamer

  16. Galane says:

    All this talk about digital recording reminds me of the reason why audio CDs are encoded at the odd rate of 44.1K instead of something more even like 48K.

    It has absolutely nothing to do with Nyquist frequency or quantization or anything else like that.

    44.1K was chosen simply because it was the most data they could get per video frame on the 3/4″ U-Matic video tape recorders that were hacked into the first digital audio recorders.
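The arithmetic usually cited for this is short enough to check directly. The figures below (3 samples stored per video line, 245 usable lines per field at 60 fields per second, or 294 lines at 50 fields per second) are the commonly quoted ones for video-based PCM adaptors. They are supplied here for illustration and do not come from the comment itself.

```python
SAMPLES_PER_LINE = 3  # commonly quoted figure for video-based PCM adaptors

# (usable lines per field, fields per second) for the two video systems
VIDEO_SYSTEMS = {
    "60-field (NTSC-style)": (245, 60),
    "50-field (PAL-style)": (294, 50),
}

rates = {name: lines * fields * SAMPLES_PER_LINE
         for name, (lines, fields) in VIDEO_SYSTEMS.items()}

for name, rate in rates.items():
    print(f"{name}: {rate} samples per second")
```

Both line counts work out to the same rate, which is part of why the number suited equipment built for either video system.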

    Same sort of crazy reason why the original standard maximum time of a CD was 74 minutes, because Sony president Norio Ohga said it “would be able to encompass an entire opera or all of Beethoven’s 9th Symphony”.

    Same reason why the various “books” for optical storage are called Red Book, Yellow Book, Orange Book, etc. Those are the colors of the folders the industry people happened to grab hold of to hold the original copies of the finalized standards. Probably wasn’t any plan to the colors, was most likely whichever color was next in the package of folders.

    The circumstances behind the name of the High Sierra format are a little less mundane. The meeting to design it was held at the High Sierra Hotel and Casino (currently called the Horizon Casino) in Stateline, Nevada, near Lake Tahoe.

    • garym53 says:

      Indeed you are correct – and it is a short sighted decision that has cost the world BILLIONS of dollars over the years – the cost of resampling from a non-integer rate is enormous in any unit you wish to measure it.

      • dwan says:

        44100 samples per second is an integer. If you have to downsample audio to 44.1kHz, you should record at 88.2kHz in the first place. Many audio interfaces allow this sample rate.

      • anon says:

        Filters don’t care what frequency they’re set to, recording at twice the target has no advantages vs. recording at 2.088435374x the target. Resampling is resampling.

      • anon says:

        Woops, messed that up (coffee not in yet), it’s 2.176870748x ;-)

      • dwan says:

        @anon : unless you consider that removing one sample every two samples is not really resampling, as you do not create any new sample by interpolating. 88.2 to 44.1 removes samples, 48 to 44.1 creates new samples.

      • anon says:

        @dwan: Doing that violates the Nyquist limit so it gets no sympathy from me. If you add a lowpass to not violate the Nyquist limit then you actually are resampling.
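The objection to simply dropping every other sample can be demonstrated with a tone above the new Nyquist frequency. In this sketch a 30 kHz sine recorded at 88.2 kHz is decimated by two with no lowpass first. The tone frequency and sample counts are arbitrary choices for the example.

```python
import math

HIGH_RATE = 88_200  # Hz
LOW_RATE = 44_100   # Hz
TONE = 30_000       # Hz: fine at 88.2 kHz, above the 22.05 kHz limit of 44.1 kHz

high_rate_samples = [math.sin(2 * math.pi * TONE * n / HIGH_RATE) for n in range(200)]

# "Resampling" by throwing away every second sample, with no lowpass first.
decimated = high_rate_samples[::2]

# A phase-inverted 14.1 kHz tone sampled directly at 44.1 kHz.
alias = LOW_RATE - TONE
impostor = [-math.sin(2 * math.pi * alias * n / LOW_RATE) for n in range(100)]

max_diff = max(abs(a - b) for a, b in zip(decimated, impostor))
print(f"largest difference between the two sample sets: {max_diff:.2e}")
```

The decimated 30 kHz tone is sample-for-sample indistinguishable from an inverted 14.1 kHz tone, so ultrasonic content turns into audible content unless a lowpass runs before the samples are dropped.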

      • Stefaan says:

        Resampling: put zeros between the original samples at a rate of the least common multiple of original and target sample rate [upsampling]. Then use a brick wall lowpass filter (at half target rate) and pick samples at the target rate [downsampling]. Audio rates as 16, 24, 32, 48, 96kHz allow easy (= cheap) conversion due to lower LCM clock specs. 44.1kHz and derivatives increase the clock specs and cost.
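The cost difference described here shows up in the size of the intermediate rate. A minimal sketch, with a made-up helper name, that computes the least common multiple and the resulting zero-stuffing and decimation factors for a few rate pairs:

```python
from math import gcd


def resampling_plan(source_hz: int, target_hz: int) -> tuple[int, int, int]:
    """Return (intermediate rate, upsampling factor, downsampling factor)."""
    intermediate = source_hz * target_hz // gcd(source_hz, target_hz)  # least common multiple
    return intermediate, intermediate // source_hz, intermediate // target_hz


for source, target in [(96_000, 48_000), (88_200, 44_100), (48_000, 44_100)]:
    intermediate, up, down = resampling_plan(source, target)
    print(f"{source} -> {target}: upsample x{up}, lowpass at {intermediate} Hz, keep 1 in {down}")
```

Going from 96 kHz to 48 kHz, or from 88.2 kHz to 44.1 kHz, needs no zero-stuffing at all. Going from 48 kHz to 44.1 kHz means upsampling by 147 to an intermediate rate of 7.056 MHz and then keeping one sample in 160. Practical resamplers use polyphase filters to avoid computing every intermediate sample, but the ratio still drives the filter design.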

  17. KillerBug says:

    I wish Tascam would offer an updated firmware for my voice recorder…or at least a download link for the stock one so I could tinker with it. Stupid thing stops recording after 14.5 hours, but has enough memory to record for a solid week. Really sucks when trying to record an all-weekend music festival.

  18. Barnes says:

    Coincidentally Amazon is doing a lightning special on the white VR-20 at 6:30 AM PST. You need to search for ‘tascam vr-20w’

  19. jondaddio says:

    Nice hack! So simple.

  20. jondaddio says:

    Nice hack! Oh, and thanks @Barnes for the Amazon lightning special heads-up. I just grabbed a white VR-20 for $35.99.

  21. boondaburrah says:

    Now if only I could do this with my Olympus W400-S. Not that I need any more sample rate (better mics before that), but I’d like to be able to record WAV. The stupid thing forces WMA on you. Also being able to turn off the dynamic range compressor would be nice. All I want is less features, Olympus!

    I only got so far as determining it has a TI TMS320 powering the thing – I don’t think I found the firmware anywhere. Now that I’ve picked up the C2000 Launchpad I should have a debugger for that chip! Time to start poking!
