Digital Audio Stair-Step myth (or Old Man Yells At Cloud)

ericmedley · June 2016

I know I'm an old curmudgeon RCA trained audio engineer and AES officer for over 30 years and I resent having to take all the AMX beginners courses in Audio and blah blah blah...

But while watching the AMX video, there was a repetition of a myth in digital audio that us engineers have been trying to dispel for as long as digital audio has been around: the "Stair Step" graph of a digital waveform. The idea is that digital audio is represented as a series of stair steps with horizontal flat sections between sample points.

This is erroneous and not what is actually happening. They talk about the concept of Interpolation as being a method of smoothing the stair step waveform into something more natural and analog-ish.

What most folks don't seem to realize is that the most important part of digital audio is actually the conversion between analog and digital. No A/D or D/A converter ever made can produce a true square wave from sample point A to B. Instead, there is a smooth curve as the amplitude rises, falls, or stays the same. The voltage change (ΔVolts) follows a curve which is very similar to an inertial curve. The end result is not a jagged or stair-steppy waveform. Even with a cheap/bad D/A converter you don't see a jagged waveform.
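To put rough numbers on the stair-step versus smooth-curve point, here is a minimal sketch in plain Python. It compares a literal "hold the last sample" staircase against the textbook band-limited (Whittaker-Shannon) interpolation of the same samples. This is an idealized model, not a simulation of any particular converter, and the function names, the 1 kHz test tone, and the buffer length are all illustrative assumptions.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) defined as 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sample_sine(freq_hz, rate_hz, count):
    """Return `count` samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

def stair_step(samples, rate_hz, t):
    """The 'stair-step' picture: hold the most recent sample value."""
    return samples[int(math.floor(t * rate_hz))]

def band_limited(samples, rate_hz, t):
    """Ideal reconstruction: one sinc pulse per sample, summed at time t."""
    return sum(s * sinc(t * rate_hz - n) for n, s in enumerate(samples))

rate = 44100.0   # CD sample rate
freq = 1000.0    # hypothetical test tone
samples = sample_sine(freq, rate, 2000)

# Evaluate halfway between two samples near the middle of the buffer,
# where the finite-length sinc sum is most accurate.
t = 1000.5 / rate
true_value = math.sin(2 * math.pi * freq * t)
step_error = abs(stair_step(samples, rate, t) - true_value)
sinc_error = abs(band_limited(samples, rate, t) - true_value)

print(f"stair-step error:   {step_error:.5f}")
print(f"band-limited error: {sinc_error:.5f}")
```

The small remaining error in the band-limited case comes from using a finite buffer of samples; it shrinks as the buffer gets longer.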

The "harsh digital" sound most folks experienced with early converters had a lot to do with the poorly designed electronics responsible for creating the analog voltage. They were cheaply made because they had to fit on a chip, and they used very poor quality components with lots of harmonic distortion and horrible signal-to-noise ratios. (Not to mention poor noise rejection, given that they were living inside a digital processor producing tons of EM noise.)

Modern A/D and D/A converters do a much better job of interpolation and have much better noise rejection and signal-to-noise ratios. But all of this is still the analog part of the chain. The digital side has changed little in 25 years.

And while I'm being an audio curmudgeon - higher sample rates are a gimmick. I master records as part of my living and I receive files all the time at 96K, 192K, etc... Trust me - there is no audio information in them above 22-23kHz.

Everything above that is just white noise. It's actually lowering the signal-to-noise ratio of the audio chain, which is what we all perceive as good or bad quality. Normal 44.1kHz is plenty of frequency coverage for us humans (0-22kHz), and if you're an audiophile I can say that a 48kHz sample rate almost always gets all the high frequency that is actually recorded (0-24kHz).
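The frequency ranges quoted above follow from the Nyquist limit: a sampled system can represent content up to half its sample rate. A tiny sketch of that arithmetic (the function name is just an illustrative choice):

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2.0

for rate in (44100, 48000, 96000, 192000):
    print(f"{rate} Hz sampling -> content up to {nyquist_hz(rate) / 1000:.2f} kHz")
```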

If you truly want to make the listening experience more glorious, go with a higher bit depth. 24 bits has a noticeably clearer sound quality than 16. I've done Analog vs. Digital side by side shootouts with AES members where we blind tested a 24-bit/44.1kHz digital recording playback with an ATR-24 1" reel to reel (2-channel) analog machine. It was nearly impossible to tell the difference reliably.

And lastly, GET OFF MY LAWN.

chill · June 2016

Hear, hear. I was an RCA video guy. Remember the glorious TR-70.

And instead of getting off my lawn, go get the mower and make yourselves useful.

Morris · June 2016

I don't have anywhere near the audio experience you have (private studio for 15 years, trained at uni in DSP) so I bow to your superior knowledge and experience but I would have thought that a second of analogue audio that was converted to 192000 samples compared to 44100 samples would start its digital life much cleaner/clearer since it has more than 2x the samples to represent the same second of audio. Of course how this relates to the auditory perception or 'quality' of the sound depends on what you are recording/mixing/mastering.

higher sample rates are a gimmick

In some of the tests I have conducted I heard an astounding difference between 44.1kHz and 192kHz, mostly in the higher frequencies above about 8kHz. The most noticeable test was the recording of a grand piano with a Rode NT1A through a ProTools HD1 system, the only thing that changed was the sample rate as this is what I was testing and side by side I could definitely hear a difference. Yet the same test with a drum kit only showed a minor lift to the high end brightness and much less of a perceptible difference.

If you have the time I would like to know your opinion of what I might have been hearing (was it due to extra noise in the extreme ranges lifting the brightness somehow or just better resolution in the higher frequencies? It was subjective listening so ppl in the studio heard different things but most commented on an increased brightness).

IRL, 48kHz is plenty IMO. No need for massively high sample rates unless the job requires them for some strange reason. I totally agree about bit depth. 24 is the go. I never run my converters at 16. Everything gets pushed to 32 in my DAW when doing any processing (plugins, eq etc) and I figure (math aside) that the closer to 32 it starts out the better off I am.

Also, I agree that teaching people the "stair step" way is not good and can lead to misunderstandings in some of the more advanced topics such as interpolation. The graph would be much better represented using single sample points ala Udo Zolzer's Digital Audio Signal Processing book.

ericmedley · June 2016

Morris wrote: »

I don't have anywhere near the audio experience you have (private studio for 15 years, trained at uni in DSP) so I bow to your superior knowledge and experience but I would have thought that a second of analogue audio that was converted to 192000 samples compared to 44100 samples would start its digital life much cleaner/clearer since it has more than 2x the samples to represent the same second of audio. Of course how this relates to the auditory perception or 'quality' of the sound depends on what you are recording/mixing/mastering.

In some of the tests I have conducted I heard an astounding difference between 44.1kHz and 192kHz, mostly in the higher frequencies above about 8kHz. The most noticeable test was the recording of a grand piano with a Rode NT1A through a ProTools HD1 system, the only thing that changed was the sample rate as this is what I was testing and side by side I could definitely hear a difference. Yet the same test with a drum kit only showed a minor lift to the high end brightness and much less of a perceptible difference.

If you have the time I would like to know your opinion of what I might have been hearing (was it due to extra noise in the extreme ranges lifting the brightness somehow or just better resolution in the higher frequencies? It was subjective listening so ppl in the studio heard different things but most commented on an increased brightness).

IRL, 48kHz is plenty IMO. No need for massively high sample rates unless the job requires them for some strange reason. I totally agree about bit depth. 24 is the go. I never run my converters at 16. Everything gets pushed to 32 in my DAW when doing any processing (plugins, eq etc) and I figure (math aside) that the closer to 32 it starts out the better off I am.

Also, I agree that teaching people the "stair step" way is not good and can lead to misunderstandings in some of the more advanced topics such as interpolation. The graph would be much better represented using single sample points ala Udo Zolzer's Digital Audio Signal Processing book.

It's a valid question and an interesting thing to study. Oddly enough, frequency response has little to do with what we perceive as "clear" sound. What seems to be more of a factor is the absence of noise and distortion. If you are hearing a sound source and all of its attendant overtones, your ear is generally pleased. This is accomplished more than well enough within our spectrum of hearing (20-20kHz).

There are plenty of devices out there that operate at a 192kHz sampling rate (with an effective frequency response of 0-96kHz) that sound awful. Sample rate only determines the possible frequency range of a given system. Sample rate is the horizontal divisions of a signal over time. Since the D/A converter is recreating the analog voltage waveform, along with its natural tendency to recreate the voltage curve between sample points just like a good ole analog circuit, a 20kHz waveform at a 44.1kHz sample rate looks exactly like one at 192kHz. There is really no measurable difference.
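The claim that a 20kHz tone comes out the same at either sample rate can be checked numerically under an idealized model. The sketch below samples the same tone at 44.1kHz and 192kHz, reconstructs each with band-limited (sinc) interpolation, and compares both against the exact sine value at an arbitrary instant between samples. The helper names, the 0.2 second buffer, and the chosen instant are assumptions made for illustration; a real converter's reconstruction filter only approximates this.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) defined as 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(tone_hz, rate_hz, duration_s, t):
    """Sample a sine at rate_hz, then rebuild its value at time t via sinc sum."""
    count = int(rate_hz * duration_s)
    total = 0.0
    for n in range(count):
        sample = math.sin(2 * math.pi * tone_hz * n / rate_hz)
        total += sample * sinc(t * rate_hz - n)
    return total

tone = 20000.0    # 20 kHz, below the Nyquist limit of both rates
t = 0.1000037     # arbitrary instant near the middle of the buffer
exact = math.sin(2 * math.pi * tone * t)

at_44k = reconstruct(tone, 44100.0, 0.2, t)
at_192k = reconstruct(tone, 192000.0, 0.2, t)

print(f"exact:       {exact:.4f}")
print(f"from 44.1k:  {at_44k:.4f}")
print(f"from 192k:   {at_192k:.4f}")
```

Any small disagreement between the three values is the truncation error of the finite sinc sum, not information lost at the lower sample rate.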

We can go into the more esoteric discussion of "but what about the stuff happening above our hearing range and its attendant sub-harmonic effects within our spectru..." stuff. That's certainly a possibility.

But the real audible difference occurs with the bit depth. Bit depth is the number of divisions vertically on the waveform (how many ticks between volumes -100% through +100%). This ends up dictating the overall signal-to-noise ratio of the circuit. With the best analog recording gear, a signal-to-noise ratio of 95dB to 102dB is the best you can hope for. My ATR-24 1/2" 2-track is about the best there is for StoN and it's very pleasant to listen to.

But a 16-bit uncompressed audio file has a StoN ratio of 108dB. That's an order of magnitude more dynamic range. The extremely quiet sections of a recording (like jazz or classical) don't end up getting lost in the noise floor as much, since it's way more accurate very close to the zero volume line.

Now, one step further - go to 24 bits and you've increased your dynamic range all the way out to 136dB. This is again orders of magnitude more room from dead quiet to full volume. You can hear even more detail of low volume subtlety.
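For reference, the usual textbook estimate for an ideal N-bit quantizer driven by a full-scale sine wave is roughly 6.02N + 1.76 dB. The sketch below works that out for 16 and 24 bits. These are theoretical figures for an ideal converter, so they do not match the numbers quoted in this post exactly, and real converters typically measure lower than the ideal. The function name is an illustrative assumption.

```python
import math

def ideal_snr_db(bits):
    """Ideal quantizer SNR for a full-scale sine: 20*log10(2^N) + 10*log10(1.5)."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

for bits in (16, 24):
    print(f"{bits}-bit: about {ideal_snr_db(bits):.1f} dB")
```

Each extra bit adds about 6 dB, so the step from 16 to 24 bits adds about 48 dB of theoretical range.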

I prove this point to my audio students all the time with some microphones I have. I have two really good tube mics. One has a frequency range considerably lower than the other (20-20kHz vs. 20-16kHz).
Yet they consistently choose the lower frequency range mic as sounding "clearer" and "more airy." When we look at the waveforms on a scope they look pretty much the same (with obvious minor differences). But what becomes clear on the scope is that the "better range" mic has a significantly higher noise floor.

I also know that when I play 16 vs. 24 bit files, I can get a more consistent response from non-audio types that it sounds noticeably better.

ericmedley · June 2016

Here's a great "technical" video that does a great job explaining it.
https://www.youtube.com/watch?v=cIQ9IXSUzuM

a_riot42 · June 2016

I think we do hear the higher frequencies above 20khz. We might not be hearing it with the same mechanism as lower frequencies but I still think we hear them. Perhaps we can't hear a single high freq but when there are multiple high freq's at the same time, you can likely hear the interference between them if not the actual freq.
Paul

ericmedley · June 2016

a_riot42 wrote: »

I think we do hear the higher frequencies above 20khz. We might not be hearing it with the same mechanism as lower frequencies but I still think we hear them. Perhaps we can't hear a single high freq but when there are multiple high freq's at the same time, you can likely hear the interference between them if not the actual freq.
Paul

a_Riot42,
I'm with you on that. On a normal CD the Nyquist folding frequency is around 22K, so all audio mastered for 44.1 is band limited at right around 21kHz. I do think there is some harm being done there (not as much as what happens when you down sample from higher rates, mind you). But it's still there.
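The reason for band limiting below the folding frequency is aliasing: any content above half the sample rate that reaches the sampler gets mirrored back down into the audible band. A minimal sketch of the folding arithmetic, with a hypothetical helper name and example tones chosen purely for illustration:

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone shows up after sampling with no anti-alias filter."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

for tone in (10000, 21000, 25000, 30000):
    print(f"{tone} Hz sampled at 44100 Hz appears at {alias_frequency(tone, 44100)} Hz")
```

Tones below 22.05kHz pass through unchanged, while the 25kHz and 30kHz examples land back inside the audible range, which is what the band limiting is there to prevent.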

I'm pretty sure I can't hear much above 16-18kHz. But I can notice a loss of 'air' if I am mastering something with a roll off around 19-20kHz. Personally, I find that 48K is plenty of sampling rate to get almost everything. The most well recorded stuff I've mastered (from a guy who recorded a famous big city orchestra at 96/24 with a pair of Bruel & Kjaer mics) had some measurable stuff going on up to around 24/25 kHz. Everything above that was just noise floor.

I also agree with you that the interaction of stuff up there does cascade downward to within our hearing range. (subharmonics and comb filtering and so forth)

Does the average listener hear this stuff??? I dunno... I'm of the opinion that the vast majority of folks don't give a crap about sound quality. The vast majority of music is listened to on compressed streams or MP3s, which sound like a$$. But, that's just me being crotchety again.

Morris · June 2016

Thanks for your explanation and the excellent video! This is very interesting stuff.

You may be crotchety but you're 100% correct. Most people nowadays don't care and MP3s do sound like a$$! Unfortunately many ppl now have grown up listening to them, instead of hearing the great vinyl or tape recordings of the past.

Darkside · June 2016

Eric, that is such a great vid - and I have to admit, my eyebrows raised a little more than once at a couple of points (as I found myself questioning my own understanding!).

Does the average listener hear this stuff??? I dunno... I'm of the opinion that the vast majority of folks don't give a crap about sound quality. The vast majority of music is listened to on compressed streams or MP3s, which sound like a$$.

Good point! The other thing not discussed (different topic) is it is all very well having pure or near pure sources, but, (and I bet you know where I am going here....) if the output section isn't much chop, it doesn't really matter anyway!!!! (Yes this may lead to yet another discussion - 'You can't polish a...':cool:)

[Nostalgia] We all have our pet products and for listening, I used to sit and be totally amazed whilst listening to the Alan Parsons Project's 'I Robot' album on UHQR through Stig Carlsson's OA-52's. A long time ago for this old mind, but believe these 2 way speakers were spec'd at 27-20 via a 7" driver and dome tweeter. Amazing design concepts of purity and outstanding fidelity from a great man IMO.[/Nostalgia]

Thanks for the post/vid! Had two cups of coffee over it!
