[longish] Why I prefer 48 kHz over 44.1 kHz

Discussion in 'Audio Hardware' started by Alice Wonder, Mar 24, 2014.

Thread Status:
Not open for further replies.
  1. Alice Wonder

    Alice Wonder Active Member Thread Starter

    Location:
    Redding, CA
    44.1 kHz vs 48 kHz

    Okay, let me start by stating I do not believe there is an audible difference between the two that adults can hear. Children maybe, but not adults.

    Children may be able to hear a difference in audio with content approaching 20 kHz, because anti-aliasing filters are not perfect and affect frequencies close to but below Nyquist. At 48 kHz the Nyquist frequency is 24 kHz; at 44.1 kHz it is 22.05 kHz. So with respect to the 20 kHz that some children can hear, a 48 kHz sample rate gives the ADC's anti-alias roll-off a little more room below Nyquist, making it easier to confine its effect to frequencies above 20 kHz.
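
    To put numbers on that headroom, here is a minimal Python sketch. The 20 kHz passband edge comes from the paragraph above; the function names are made up for illustration.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Nyquist frequency: half the sample rate."""
    return sample_rate_hz / 2.0


def transition_band_hz(sample_rate_hz: float, passband_edge_hz: float = 20_000.0) -> float:
    """Room between the top of the passband and Nyquist for the filter roll-off."""
    return nyquist_hz(sample_rate_hz) - passband_edge_hz


for rate in (44_100, 48_000):
    # Prints the rate, its Nyquist frequency, and the room left above 20 kHz.
    print(rate, nyquist_hz(rate), transition_band_hz(rate))
```

    Under these assumptions the filter has 2,050 Hz to work with at 44.1 kHz and 4,000 Hz at 48 kHz, so roughly twice the room.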

    But for adults I don't think it matters.

    Secondly, for playback I believe the super-high-resolution sample rates are a waste. Digital signal processing is science, and a properly functioning DAC will produce perfect waves from data below Nyquist.

    In simplest terms, think of polynomials. Given any two sample points, if you know it is a line, you can perfectly reproduce an analog version of the line. More points won't let you produce a "better" line; you already have enough information to make a perfect analog line.

    Any three sample points, if you know it is a parabola, you have enough data to make a perfect analog parabola. Or if you know it is a circle, a perfect analog circle.
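
    The polynomial analogy can be checked directly. The sketch below uses Lagrange interpolation, which is one standard way of fitting the unique polynomial through a set of points; the example parabola and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction


def lagrange_eval(points, x):
    """Evaluate the unique polynomial of degree < len(points) through `points` at x."""
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total


def parabola(x):
    """Hypothetical 'analog' curve that we only get to sample three times."""
    return 2 * x * x - 3 * x + 1


# Three samples are enough to pin down a parabola exactly.
samples = [(x, parabola(x)) for x in (0, 1, 4)]

# Evaluate at a point that was never sampled and compare with the true curve.
print(lagrange_eval(samples, 10) == parabola(10))
```

    Exact fractions are used so the comparison is not blurred by floating-point rounding. Adding a fourth sample of the same parabola would not change the result.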

    Digital waves are not lines or parabolas or circles, but the point is that you can take discrete samples and create perfect analog representations. With digital signal processing, what we need in order to make a perfect analog signal is a sample rate at least twice the highest frequency we wish to reproduce, plus a little headroom for issues like imperfect anti-aliasing filters that remove the frequencies we don't want to reproduce.
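
    For band-limited audio the counterpart of "fit the line" is sinc (Whittaker-Shannon) interpolation. The sketch below is an idealised illustration, not how a real DAC is built: it takes a 19 kHz tone sampled at 44.1 kHz and estimates its value halfway between two samples from a finite run of samples. The tone, the window length and the function names are assumptions.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, first_index, fs, t):
    """Whittaker-Shannon interpolation from a finite run of samples.

    samples[k] is the signal value at time (first_index + k) / fs.
    """
    return sum(
        s * sinc(t * fs - (first_index + k))
        for k, s in enumerate(samples)
    )


fs = 44_100.0      # sample rate in Hz
tone = 19_000.0    # test tone in Hz, below the 22.05 kHz Nyquist frequency
half = 2_000       # number of samples kept on each side of t = 0

samples = [math.sin(2 * math.pi * tone * n / fs) for n in range(-half, half + 1)]

t = 0.5 / fs       # an instant halfway between two sample points
estimate = reconstruct(samples, -half, fs, t)
exact = math.sin(2 * math.pi * tone * t)

# The residual comes from using a finite window instead of every sample.
print(abs(estimate - exact))
```

    The printed residual is small but not zero, because only 4,001 samples are used rather than the infinite run the theorem assumes.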

    96 kHz or 192 kHz sample rates have value in the mastering process but not for playback within our audible range.

    The superior sound of vinyl that many of us (myself included) enjoy is the result of artifacts from the medium, artifacts that are pleasing to our ear. They can now allegedly be reproduced in digital with filters, though I suspect the slight variances that happen during vinyl playback may be part of the magic of vinyl.

    With respect to 16 bit vs 24 bit audio, I do believe some people may physically be able to tell the difference but I'm guessing only with the volume turned up so loud that it would damage your ears and you wouldn't be able to tell the difference for long. At typical listening volumes, the noise floor of 16 bit is below what we can hear already.
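
    For reference, the usual textbook figure for the signal-to-noise ratio of an ideal N-bit quantizer with a full-scale sine wave is about 6.02 N + 1.76 dB. The sketch below evaluates that formula; it ignores dither and noise shaping, and the function name is made up for illustration.

```python
import math


def ideal_quantizer_snr_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale sine wave."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(bits, round(ideal_quantizer_snr_db(bits), 1))
```

    Under this idealised model 16 bit gives roughly 98 dB and 24 bit roughly 146 dB.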

    But back to the purpose of this thread, 44.1 kHz vs 48 kHz. There is not an audible difference in my opinion, but I believe that unless you are mastering an audio CD, digital audio should be mastered to 48 kHz for playback. Here's why.

    When the DSP in your computing device mixes audio streams, they all need to be at the same sample rate. Digital audio at a different sample rate has to be re-sampled.

    I don't know how many of you build your own PCs. I build my own, even though it is more expensive, because I like to pick what goes into it. In the old days, when building a PC, there was an audio cable that went from the CD-ROM drive to the sound card. The reason: back then, computer sound cards could only process 48 kHz PCM data, so 44.1 kHz PCM data would have to be re-sampled, and that would take processing power. So CD-ROM drives had an audio cable feeding analog audio to the sound card for playback, avoiding the need for re-sampling.

    Nowadays, most sound cards can do 44.1 kHz and re-sampling doesn't tax the system as much, so that audio cable is not needed.

    But if your sound card is set to 48 kHz, those 44.1 kHz files are re-sampled every time they play.
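
    The two rates are not a simple multiple of each other, which is why the conversion is real work. A small sketch of the ratio, using only the Python standard library (the function name is an assumption):

```python
from fractions import Fraction


def resample_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Output frames produced per input frame, as an exact reduced fraction."""
    return Fraction(dst_hz, src_hz)


ratio = resample_ratio(44_100, 48_000)

# Every `denominator` input frames become `numerator` output frames.
print(ratio.numerator, ratio.denominator)
```

    So a rational resampler going from 44.1 kHz to 48 kHz has to turn every 147 input frames into 160 output frames.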

    Many sound cards switch between the two. If nothing is playing and a 44.1 kHz file is played, the sound card will operate at 44.1 kHz and not have to re-sample. If a 48 kHz file is played, it will operate at 48 kHz and again not have to re-sample. This usually works well, but switching the sample rate the card is operating at can be a source of problems if the drivers are buggy.

    The 44.1 kHz sample rate really was only ever used for Audio CD. It was also used for digital downloads because some people liked to burn audio CDs of their digitally purchased music, so 44.1 kHz meant no re-sampling was needed.

    People don't burn audio CDs nearly as much now, so I don't see a reason to keep using 44.1 kHz.

    Digital video is on the increase, and digital video uses a 48 kHz sample rate. I think digital audio should follow suit and use 48 kHz as well. If we phase out 44.1 kHz, sound cards will pretty much only need to deal with one sample rate, and switching between the two won't be needed. A sound card set to operate only at 48 kHz would occasionally have to re-sample 44.1 kHz material on the fly, but that is not difficult to do today. Buggy drivers that crash when a sound card switches rates aren't an issue if it never switches.

    This is actually something I do when I rip a CD. I rip to a single file, and while I do archive that in FLAC as ripped, I also re-sample to 48 kHz and, after the re-sample, split the tracks.

    There isn't an audible difference. When I first started doing this, I ran a re-sample loop (source CD -> 48 kHz -> 44.1 kHz -> 48 kHz, etc.) 100 times, and I was not able to hear any difference between the final result and the original 44.1 kHz. If going back and forth 100 times on my test material (Strunz & Farah - Primal Magic) did not produce an audible difference, then doing it once isn't going to cause an audible difference on any music, even if some players re-sample back to 44.1 kHz (e.g. when playing it on a smartphone).

    Interestingly, the developers of the Opus codec seem to have the same philosophy: 44.1 kHz is not a native sample rate for Opus, so if you rip an audio CD to Opus the encoder will re-sample it to 48 kHz while encoding.

    So, in a nutshell, it is my opinion that it is time to phase 44.1 kHz out. Keep things simple for the sound card: keep playback set to 48 kHz and re-sample the deprecated legacy sample rate when it is played. For new recordings, record at 48 kHz. For ripping CDs, the audio on the CD has to be 44.1 kHz, but for lossy encoding, re-sample to 48 kHz either manually or by using a codec like Opus that does it for you.

    That's my opinion. Of course do what you want, it is just my opinion. In reality it probably is not a big deal, I'm famous for over-thinking things, but I like KISS and while re-sampling my CD rips may not appear KISS on the surface, keeping my sound-card at 48 kHz is KISS.
     
    Last edited: Mar 24, 2014
    SandAndGlass and kevintomb like this.
  2. gloomrider

    gloomrider Well-Known Member

    Location:
    Hollywood, CA, USA
    As someone who primarily ventures outside of 44.1kHz for recording vinyl, I have indeed made 48kHz/24 bit files for playback in "video centric" environments.

    But I don't think you'll get a multitude of enthusiastic backers of an initiative to phase out 44.1kHz audio. Portable players would be the first thing that comes to mind. And CD players are still surprisingly ubiquitous. When I record vinyl for friends, I almost always deliver a burned CD, rarely files (and even more rarely files > 44.1kHz) on a USB stick.

    Even Ethan Winer reluctantly admits that some people (I assume he means adults) can hear when audio has been resampled.

    And regarding Opus, it has an adoption rate a fraction of the size of Ogg Vorbis.
     
    Vidiot likes this.
  3. Alice Wonder

    Alice Wonder Active Member Thread Starter

    Location:
    Redding, CA
    I believe the Opus adoption rate is small because it is new. For streaming it has significant advantages over every other common codec due to its extremely low latency. Whether it is only used for streaming (e.g. WebRTC) or becomes common for stand-alone files remains to be seen. Unfortunately, if Apple chooses not to support it in iTunes, it will probably never catch on.
     
  4. (life w/out milk)

    (life w/out milk) New Member

    Location:
    Phoenix, AZ
    It would appear that Apple has no reason at all to go with Opus. They already use and support HE-AAC v2, which is close enough. Opus being unencumbered by patents doesn't matter either. Look at ALAC: they chose to create ALAC even though FLAC already existed.

    I believe they would rather be safe in knowing they're protected by the patents surrounding AAC than gamble on getting sued for possible infringement over Opus, Ogg Vorbis, etc.
     
  5. L5730

    L5730 Forum Resident

    Folks were trying to get CD to go to 48 kHz from the beginning, and they got pushed out/ignored. It would have been a better idea to use 48 kHz to reduce the steepness of the AA filter. Seems a shame they didn't do it.
     
  6. rbbert

    rbbert Forum Resident

    Location:
    Reno, NV, USA
    44.1 was necessary in order to use the hardware available around 1980. There was a sidebar in The Absolute Sound about a year ago explaining this.
     
  7. Jim T

    Jim T Forum Resident

    Location:
    Mars
    I can't hear the diff between 48 & 44.1, but I sure can hear the diff when jumped to 24/96.
     
    LeeS likes this.
  8. beppe

    beppe Forum Resident

    Location:
    Venice, Italy
    Me too
     
  9. Grant

    Grant Now let that bass fall in! Oh yeah!

    Location:
    United States
    Hello again, Alice.

    A good idea is to never generalize. Some adults can indeed hear differences, but it requires learning what to listen for. When you go down from, say, 96k or 88.2k to 44.1k, you do hear the difference.

    Now, I'm with you that I can hardly hear any difference between 48k and 44.1k, and indeed I do most of my work at 44.1k, even if I keep it at hi-rez in the end. So I usually record at 32-bit float and 44.1k. Once in a while, if I feel it is really warranted, I'll do the drop at 88.2k or 96k. I never bother with 48k.

    I'll work at and save the master at 24-bit/44.1k, and create the 16-bit/44.1k FLAC and MP3 for listening and for the server/car. Sometimes I'll just do the entire drop at Redbook.
     
    c-eling likes this.
  10. kevintomb

    kevintomb Forum Resident


    I think even more importantly, it is not just hearing acuity, or ability to know what to listen for, but the variability built into all humans as to what actually matters or does not matter to each individual.

    Some material is slightly degraded or changed, and some is not. Not everything benefits from or is degraded by changes in resolution. In fact, I have some music that sounds quite well recorded, to me at least, and that still sounds quite good when "degraded" down to a few different MP3 compression settings.

    I have found, more than anything, that nothing can be generalized. In my MP3 experiments, stuff I thought would suffer sometimes does not at all, and other stuff has noticeable artifacts, with no rhyme or reason on some things.

    The only thing I have found is that a great recording tends to sound great at most resolutions and in most sound schemes.
     
  11. tim185

    tim185 Forum Resident

    Location:
    Australia
    Recording at higher sample rates, for me 96 kHz, has the oft-overlooked advantage of halving the latency in the DAW world. A not unimportant factor when tracking direct guitars, vocals, etc.
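
    The halving follows from how buffer latency is usually reckoned: a buffer of a fixed number of frames covers less time at a higher rate. A minimal sketch, assuming a 256-frame buffer (a made-up but typical setting) and ignoring converter and driver overhead:

```python
def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Time covered by one audio buffer, in milliseconds."""
    return 1000.0 * buffer_frames / sample_rate_hz


for rate in (48_000, 96_000):
    print(rate, round(buffer_latency_ms(256, rate), 2))
```

    The saving only applies if the buffer size in frames stays the same when the rate goes up.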
     
    L5730 likes this.
  12. darkmass

    darkmass Forum Resident

    A common enough misunderstanding. However, the original theorems were based on the signals existing over infinite time intervals. Even "Free Bird" falls slightly short of meeting that requirement. :)

    This may be some worthy reading for you...dense, most certainly, but there is a hint of honesty to it. http://www.thinkmind.org/download.php?articleid=sysmea_v2_n1_2009_1

    As an indication, here are three paragraphs from page 12 (of 17)...

    We should point out, like in Theorem 2, that if we assume infinite time interval then faster than the Nyquist rate will also not give redundant information. This concept is also easily seen from the Fourier series expression (9). To solve for the coefficients of (9) we need infinite number of samples to form a set of simultaneous equations similar to (30). As we increase the sample rate the solution of (30) will only become better, that is, the resolution of the coefficients will increase and the unknown function will also get better approximations.

    For finite time assumption higher sampling rate is necessary to achieve the desired accuracy. The reason is same; the concept of infinite dimensionality must be maintained over finite time interval. That can be achieved only by higher sample rate. We also repeat, if you know the analytical expression then the number of samples must be equal to the number of unknown parameters of the analytical expression. This case does not depend on the time interval.

    A lot of research work has been performed on the Shannon’s sampling theorem paper [2]. Somehow the attention got focused on the WT factor, now well known as the dimensionality theorem. It appears that people have [16][17] assumed that T is constant and finite, which is not true. Shannon said in his paper [2] many times that T will go to infinite value in the limit. No one, it seems, have ever thought about the finite duration issue. This is probably because of the presence of infinite time in the Fourier transform theory. The paper [15] gives a good summary of the developments around sampling theorem during the first thirty years after the publication of [2]. Interestingly [15] talks briefly about finite duration time functions, but the sampling theorem is presented for the frequency samples, that is, over Fourier domain which is of infinite duration on the frequency axis. Now we give a numerical example to show how higher rate samples actually improves the function reconstruction.


    And, three paragraphs from the conclusion...

    We have given various proofs to show that k times, k>1, the Nyquist sample rate is necessary to improve the accuracy of recovering a function that is available only over finite time measurement window. We have shown that this k can be selected based on the required accuracy estimate ɛ.

    The foundation of our derivations used the infinite dimensionality property of the function space. The concept essentially means that an infinite number of samples are necessary to precisely extract all the information from a function.

    We have pointed out that many of our existing definitions and theories depend on the infinite time assumptions. We should systematically approach to eliminate this requirement from all our theories to make them realistic for our engineering problems.


    As a mathematical concept, epsilon (ɛ) is never equal to zero over finite intervals (and certainly not less than zero). However, what the linked paper demonstrates is that for whatever tiny epsilon is chosen, it can be attained by a sufficiently high multiplier of the fundamental Nyquist rate. That is, a chosen sample rate (and "properly functioning DAC" for converting that rate) achieves some engineering desired level of good enough...never "perfect". Of course, based on that, the Nyquist rate itself cannot achieve perfection.
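
    A small numerical sketch of the finite-interval effect, with loud caveats: it uses plain sinc interpolation of a 1 kHz tone at a fixed 48 kHz rate and only varies how many samples are available around the point being reconstructed. That is a different knob from the sample-rate multiplier k discussed in the paper, and the tone, rate and function name are assumptions chosen for illustration.

```python
import math


def truncated_sinc_error(half_width: int, fs: float = 48_000.0, tone_hz: float = 1_000.0) -> float:
    """Error midway between two samples when only the
    2 * half_width + 1 samples nearest t = 0 are used."""
    x = 0.5  # reconstruction instant, measured in sample periods
    estimate = 0.0
    for n in range(-half_width, half_width + 1):
        sample = math.sin(2 * math.pi * tone_hz * n / fs)
        arg = math.pi * (x - n)  # never zero, because x is a half-integer
        estimate += sample * math.sin(arg) / arg
    exact = math.sin(2 * math.pi * tone_hz * x / fs)
    return abs(estimate - exact)


for half_width in (10, 100, 1_000):
    # The error shrinks as the window grows but does not reach zero.
    print(half_width, truncated_sinc_error(half_width))
```

    With a finite window the error is never exactly zero, yet it can be pushed below a chosen tolerance, which is the "good enough" rather than "perfect" point made above.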

    Tying that all into your 48 kHz... That is a slightly better than Nyquist good enough. It is good enough for you, can't argue, but there's a fundamental premise problem...and, of course, each individual set of ears has its own "good enough" level.
     
    jukes and Ham Sandwich like this.
  13. Jim T

    Jim T Forum Resident

    Location:
    Mars
    I have many 9th graders who are not proficient in single-digit multiplication tables (why, in 2014, is the real problem), so trying to get them to worry about any of this, when their phones play MP3s just fine, is not happening. Even the difference between Redbook and 256 kbps doesn't matter to them. Can their Beats not play clearly enough to expose the improvements?

    When I taught in 6th grade I had a parent ask me for an answer key! I was watching the baseball game on TV and no one knew how to even begin to determine the time it takes for a pitched baseball at a certain MPH to reach home plate (The distance is 60.5 feet) . They probably can't determine an ERA either.
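
    For the record, the pitch question is a one-line unit conversion. The sketch assumes constant speed over the full 60.5 feet and a 90 mph pitch, which is a made-up example value; a real pitch is released closer to the plate and slows down in flight.

```python
def pitch_time_s(speed_mph: float, distance_ft: float = 60.5) -> float:
    """Seconds for a ball at constant speed to cover the given distance."""
    feet_per_second = speed_mph * 5280.0 / 3600.0  # miles per hour to feet per second
    return distance_ft / feet_per_second


print(round(pitch_time_s(90.0), 3))
```

    At 90 mph, which is 132 feet per second, that works out to a little under half a second.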

    I still don't understand why anyone would track/record at 48 kHz in 2014. If one cares about real quality, nothing below 24/96 should be good enough. HD space is just too cheap, and at least your tracking and masters would be good enough that, if you did produce a hit, you would have something to really work with when it was time to put out re-mastered material. If your end goal is to go to 16/44.1 anyway, it makes no sonic difference, with the software available today, whether you start from 48 kHz or 96 kHz for the resampling and bit depth changes.
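
    To put the disk-space argument in numbers, here is a rough sketch for uncompressed stereo PCM. It ignores file headers and any lossless compression, and the function name is an assumption.

```python
def pcm_bytes_per_minute(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Size of one minute of uncompressed PCM audio, in bytes."""
    return sample_rate_hz * (bit_depth // 8) * channels * 60


redbook = pcm_bytes_per_minute(44_100, 16)
hi_res = pcm_bytes_per_minute(96_000, 24)

print(redbook, hi_res, round(hi_res / redbook, 2))
```

    One minute of 24/96 stereo takes a bit more than three times the space of the same minute at 16/44.1.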

    I have been listening to the 4 free tracks of 24/96 music offered on OceanWay Audio's web site, produced by the great Allen Sides, and can't imagine why anyone would want to listen to this at less than 24/96. I used Sony Sound Forge to reduce it to Redbook files, and clearly the space, the separation between instruments, and the smoothness were lessened by doing that. It was obvious even to my very old ears.

    On the OceanWay Audio page at the top, far right, click "Special Download" and enjoy.
     
  14. ronankeane

    ronankeane Forum Resident

    Location:
    Dublin, Ireland
    Thanks for linking that; it's very interesting. I've thought about this quite a lot over the last couple of years, since I read about the Nyquist theorem, without coming to any clear conclusions. One thing that strikes me is that a finite-time signal can easily be extended to an infinite-time signal by regarding it as infinitely repeating in the past and future. This makes a signal which is not time-limited but essentially contains only the information contained in the time-limited signal. I've only had a chance to skim the paper, but that possibility doesn't seem to be addressed there. It also looks like there is a logical error in the first part of Section 5 (before the theorem), but I'd have to read it more carefully to be sure.

    More fundamentally, I wonder about the disconnect between the theory and our hearing. Mathematically, it is impossible to have a signal that is both time-limited and bandwidth-limited. What does that mean for human hearing? I perceive sounds as time-limited, and experiments indicate that human hearing is bandwidth-limited. So what gives?
     
  15. ronankeane

    ronankeane Forum Resident

    Location:
    Dublin, Ireland
    Well I read the paper and I think it's full of faulty logic.
     
  16. Metralla

    Metralla Joined Jan 13, 2002

    Location:
    San Jose, CA
    Interesting paper. Just before section 5

     
  17. darkmass

    darkmass Forum Resident

    Any specifics?
     
  18. darkmass

    darkmass Forum Resident

    Is there a disconnect between the theory and our hearing? What we humans do is perceive a signal through our personal temporal and bandwidth "window". Our personal windows cannot modify a signal's actual nature. Of course, if you want to progress along such philosophical/physics lines, you might be better off giving attention to Quantum Theory, Schrodinger's cat, and "the observer always modifies the event" (which I tend to think is quite cool...and true). Not that there isn't a future for Quantum Signal Theory.

    But existing signal/sample theory, maybe more exactly practice, seems to care about human perception only to the extent of setting some "good enough" parameters.

    Now, best as I can tell, Signal Theory states not only that a signal cannot be simultaneously time-limited and bandwidth-limited, it states that a signal must be time-limited or bandwidth-limited. As a consequence, if your "finite-time signal can easily be extended to an infinite-time signal by regarding it as infinitely repeating in the past and future" works as you believe (and there is an appeal to that), then at the same time it must force the constructed signal into a state of bandwidth limitation. I'll leave the demonstration of that to the student. :)
     
    Last edited: Aug 8, 2014
    jfeldt likes this.
  19. ronankeane

    ronankeane Forum Resident

    Location:
    Dublin, Ireland
    In Section 5 their way of "proving" that higher sample rates are not redundant is by demonstrating that, for a particular sampling/reconstruction approach, the results get more accurate as the sample rate is increased. However, all they have proved is that this particular approach is not optimal. A more efficient approach to reconstruction (as in Shannon's proof) will get you the entire function once the sampling rate exceeds Nyquist and beyond that the extra samples are redundant.

    This relates to a reconstruction using step functions. If applied to a straight line, it would approximate the straight line by a step function, and as the sample rate increases the steps would get smaller and smaller and the output would get closer and closer to the actual straight line. They mention, in a few different places throughout the paper, that if you know the form of the input signal (e.g. sine wave, straight line) then fewer samples are needed. But they don't see how the assumption of band limiting, while not as strong as knowing the form of the input, is enough to make their approach inefficient and therefore not a good measure of when extra samples are redundant.
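
    The behaviour of a step-function reconstruction is easy to see numerically. The sketch below is an illustration of sample-and-hold in general, not a reproduction of the paper's numerical example: it holds each sample of a 1 kHz sine until the next one arrives and records the worst deviation from the true sine. The tone, the rates and the function name are assumptions.

```python
import math


def hold_error(tone_hz: float, fs: float, duration_s: float = 0.01, oversample: int = 50) -> float:
    """Worst-case error of a sample-and-hold (step function) reconstruction of a sine."""
    worst = 0.0
    total_samples = int(duration_s * fs)
    for n in range(total_samples):
        held = math.sin(2 * math.pi * tone_hz * n / fs)  # value held until the next sample
        for k in range(oversample):
            t = (n + k / oversample) / fs                # instants inside the current step
            worst = max(worst, abs(math.sin(2 * math.pi * tone_hz * t) - held))
    return worst


errors = {fs: hold_error(1_000.0, fs) for fs in (8_000.0, 16_000.0, 32_000.0, 64_000.0)}
for fs, err in errors.items():
    print(int(fs), round(err, 4))
```

    The error keeps falling as the rate rises, but that says something about the step-function method rather than about whether the extra samples carry new information about a band-limited signal.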

    If their "proof" was correct then they could construct a contradiction of Shannon's theorem. They actually say as much on page 12: "We should point out, like in Theorem 2, that if we assume infinite time interval then faster than the Nyquist rate will also not give redundant information." This contradicts Shannon, although they say earlier in the paper that they don't mean to challenge his proof.
     
  20. ronankeane

    ronankeane Forum Resident

    Location:
    Dublin, Ireland
    I'm not aware of the principle you're referring to. Is it an engineering principle? I'm coming at this from a pure maths perspective. For example, the function f(x) = x^2 (defined on the set of real numbers) is neither band limited nor time limited.
     
  21. darkmass

    darkmass Forum Resident

    Time for me to get back to this. My apologies, I've been otherwise engaged.

    My apologies for this statement as well, "it states that a signal must be time-limited or bandwidth-limited", even if I did preface it with a slight disclaimer. I will confess that the arena of digital signal processing is out of my wheelhouse. (My career, while truly technical, was not in this domain. My education included a class in Hilbert Spaces, the one thing that was close, but that was a fair while ago.) I am engaged in adding to my fund of knowledge. That statement was one I encountered in my research, but I've been unable to relocate it. Nevertheless, I believe what I read was in error. I also saw it supported in another location, but that support came by way of what I think was an inappropriate graphical convention (a double-headed arrow) that meant "if and only if" to me, when it was intended to mean only a single-direction "implies". Mea culpa...but I learn. If I misspeak, and that is always possible in any learning sequence, I will learn.
     
  22. darkmass

    darkmass Forum Resident

    First of all, though I expect you are familiar with Shannon's paper, the general forum inhabitant may not be. A PDF is located here: http://nms.csail.mit.edu/spinal/shannonpaper.pdf

    I should also state that from my perspective Das, Mohanty, and Singh (for the general forum, the link to their paper is provided in post #12, above) indeed have no intention of contradicting Shannon, and they don't...rather they are taking on a slightly different aspect of the same overall mathematical space. Of course, we have their e-mail addresses as provided in their paper, they could be invited in for a look...or even a comment. :)

    Das, et al, are looking at finite duration continuous signals--a type commonly the case for musical information. One thing their paper illuminates is the somewhat hidden mathematical infinities built into Shannon's paper. There is nothing incorrect about these infinities; Shannon's work is good work. But if the infinities are not accounted for, there will be misunderstandings about the underpinnings of the digital sampling based music that is a large part of these fora (see the OP's statement I highlighted in my initial post). In general, "perfect" sampling/reconstruction is only mathematically possible for infinite duration signals.


    Now Shannon says (his page 457, Section XII. Continuous Sources):

    "Fortunately, we do not need to send continuous messages exactly. A certain amount of discrepancy between the original and the recovered messages can always be tolerated. If a certain tolerance is allowed, then a definite finite rate in binary digits per second can be assigned to a continuous source. It must be remembered that this rate depends on the nature and magnitude of the allowed error between original and final messages. The rate may be described as the rate of generating information relative to the criterion of fidelity."


    You state:

    "In Section 5 their way of 'proving'.... However, all they have proved is that this particular approach is not optimal. A more efficient approach to reconstruction (as in Shannon's proof) will get you the entire function once the sampling rate exceeds Nyquist and beyond that the extra samples are redundant."

    Shannon's (Section XII) Theorem 5 indicates that for a signal with a continuous source the desired criterion of fidelity (N1) leads to a bounding of information rate in bits per second. Looking at Shannon's equation 44, if N1 goes to zero as a limit, the required information rate goes to infinity as a limit. Das, et al, are saying in a time limited duration, increasing sample rates bring you closer to the original source continuous function--that is not redundancy, not if you can more closely approach the original signal. In Shannon's Section XII, is he writing with reference to a finite duration signal? An infinite duration signal? Let's say it's finite duration: he himself shows that increasing sample/information rate leads to an increase in fidelity to the original signal--that too is not redundancy. If we say it's an infinite duration signal, well that seems to be something that can be perfectly reconstructed (though we might get tired of waiting) and perfectly constructed by a sampling rate beyond Nyquist. That's agreeable. But. We are talking about an infinite duration. The original "beyond Nyquist" sample rate leads to an infinite number of samples. Double that rate, and there is an infinite number of samples. Unless I misunderstand "an infinite number of samples" and the concept of infinite duration, if an infinite number of samples is not redundant...twice, say, the original number of samples induces no redundancy. (But recall, Das, et al, are concerned with finite durations, so the infinite case is somewhat extraneous.)
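
    The limiting behaviour described for equation 44 can be sketched numerically. This assumes the upper bound has the form R = W log2(Q / N1), with W the bandwidth, Q the signal power and N1 the tolerated mean-square error; check the linked PDF for the exact statement. The 20 kHz bandwidth, unit power and function name are made-up illustration values.

```python
import math


def rate_upper_bound_bps(bandwidth_hz: float, signal_power: float, tolerated_error_power: float) -> float:
    """Assumed form of the bound: R = W * log2(Q / N1), in bits per second."""
    return bandwidth_hz * math.log2(signal_power / tolerated_error_power)


for n1 in (1e-3, 1e-6, 1e-9, 1e-12):
    # The required rate grows without limit as the tolerated error shrinks.
    print(n1, round(rate_upper_bound_bps(20_000.0, 1.0, n1)))
```

    Each thousandfold tightening of the tolerance adds the same increment to the rate, so the bound diverges only logarithmically as N1 goes to zero.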

    Oh, I think the "straight line" comment you make doesn't quite pertain. A straight line is, at best, a special case of a continuous signal. The roots of calculus involve an approach not so very divorced from the approach of Das, et al, (and, if this works, you should see large elements of their approach in the final section here: http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0CB0QFjAA&url=http://link.springer.com/content/pdf/10.1007%2FBF03322936.pdf&ei=lcrvU6XIEYjwoATFoIKYAw&usg=AFQjCNFw4gMGJqgYMFEVLQhKM13ckLgpcQ&sig2=XiudbdEASiPdntL3wT28GQ ). If a person is looking for the area under a bounded region of the function f(x) = 5, calculus may be a bit much, but that does not invalidate calculus.

    Now a Nyquist limit is a useful thing. But, ultimately, with real systems, any such limit is an engineering specified limit related to Shannon's "criterion of fidelity". Yes, because of aliasing concerns, nothing above Nyquist can be meaningfully sampled/reconstructed (leading to necessary low pass filtering), but below Nyquist reconstructed signals still do not have perfect fidelity to the original source (with a finite duration signal). Raising the sampling rate (and opening up the low pass filter in agreement), results in an improved "criterion of fidelity". However, it should be stated that any 44.1k/16 "objectivists" might still consider the better fidelity somehow redundant information. Ah, but then those folk are not actually science based, are they.
     
    Last edited: Aug 16, 2014
    jfeldt likes this.
  23. Ghostworld

    Ghostworld Forum Resident

    Location:
    US
    I was perfectly happy with gear that had a frequency response of 40-15,000 Hz. The rest is a waste! :)
     
    missan likes this.