Let’s begin by defining what ‘hi-res audio’ is, and then I’ll explain why it’s marketing baloney and no-one should touch it with a barge-pole… and why I’ve just enhanced my various software offerings to work with it anyway!
So, to begin at the beginning: there’s a thing called the Nyquist-Shannon Sampling Theorem. It says that a continuous waveform can be perfectly reproduced from a set of fixed, discrete samples if the waveform being sampled has a finite bandwidth and your sampling rate is at least twice the maximum signal frequency. That is, so long as you can say ‘this audio signal has a fixed upper limit of (say) 20KHz’, then it is mathematically provable that a sampling rate of 40KHz can capture that waveform perfectly.
When the Sony and Philips engineers were developing the Compact Disc audio format in the 1970s, they relied on this theorem to determine the characteristics of CD audio. Since the best human ear can really only hear up to 20KHz (and even then, you’ve got to be young and genetically-blessed to hear that high), we can record an orchestra and chop off any part of the audio signal above 20KHz and no-one will be any the wiser: we’re disposing of frequencies no-one of mortal woman born can hear anyway. Then, once we have a continuous audio signal with a firm upper cut-off of 20KHz, we can digitise that by sampling the signal at 40KHz and be mathematically sure of being able to perfectly re-create the original analogue audio signal.
Being clever people of the 1970s, however, the Philips and Sony engineers also realised that cut-off filters aren’t perfect. Tell them to cut off at 20KHz, and they’ll maybe kick in a bit early and chop some sub-20KHz signal off, too; or they’ll be a bit too lenient and leave behind some 20KHz+ signal that ought to have been removed. Frequency filters being imperfect, therefore, the CD developers decided to cut a little slack for the filtering process and thus decided to cut off the audio signal at 22.05KHz, rather than at precisely 20KHz. The extra 2,050Hz were there to deal with the electronic filtering imperfections of the time.
The consequence of that is that for Nyquist-Shannon to remain applicable, the sampling rate had to be twice this higher ‘highest frequency’ – and that’s why CDs have a sampling rate of 44,100 Hertz.
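The arithmetic behind that 44,100 figure is simple enough to sketch in a few lines of Python. This is only an illustration: the function names are made up for the purpose, and the second function models nothing more than an ideal sampler fed a single pure tone. It does show why the filtering step matters, though: a tone above half the sampling rate doesn’t just vanish, it ‘folds back’ into the audible band.

```python
def min_sampling_rate(max_frequency_hz):
    """Lowest sampling rate that satisfies Nyquist-Shannon for a given cut-off."""
    return 2 * max_frequency_hz


def alias_frequency(tone_hz, sampling_rate_hz):
    """Frequency at which a pure tone shows up after ideal sampling.

    Tones at or below half the sampling rate come through unchanged;
    anything higher is folded back below that limit.
    """
    nyquist_limit = sampling_rate_hz / 2
    folded = tone_hz % sampling_rate_hz
    if folded <= nyquist_limit:
        return folded
    return sampling_rate_hz - folded


print(min_sampling_rate(20_000))        # the 'textbook' 20KHz cut-off
print(min_sampling_rate(22_050))        # the CD cut-off with filter headroom
print(alias_frequency(1_000, 44_100))   # an audible tone is left alone
print(alias_frequency(25_000, 44_100))  # an unfiltered 25KHz tone folds back
```

So an unfiltered 25KHz tone sampled at 44.1KHz would land at 19.1KHz, which is exactly the sort of thing the cut-off filter is there to prevent.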
Now, I italicised an important word several times in that explanation, as you probably noticed: perfect. Nyquist-Shannon assures us that absolutely none of the input music signal is lost in the digitisation of a music signal to CD format, except that which lies above the 22.05KHz cut-off point… which you can’t hear anyway (and neither can anyone else)! It is in this sense that it is reasonable and accurate to say that you mathematically and provably cannot improve on the audio quality provided by audio CDs. Audio perfection is a function of maths (which doesn’t care how much you spent on your speakers) and biology (which also has no regard to the size of your sub-woofer).
Given those facts, and assuming people were entirely rational, that would be the end of the discussion. But of course, people are people, not rational… and humans at all times and all places have an irrational love for ‘bigger numbers’ or ‘bigger things’. If your car engine has 6 cylinders, and mine has 12, clearly my car is better! And therefore, inevitably, the fact that 44.1KHz at 16 bits gets you an audibly perfect music signal isn’t enough for some people: if their hi-fi can handle 88.2KHz at 24 bits, then whoopdy-do, that must be better than CD audio, right? (Wrong, actually: you can’t get better than perfect). Indeed, why stop there: SACDs were invented to suck more money from the gullible to get even betterer than better-than-perfect audio by encoding music signals at an eye-watering 2.8MHz. That’s Mega, not Kilo, so already a thousand times better than perfect!
Mundanely and factually, however, such high sampling rates cannot improve on perfection, and they mean the resulting digital files are enormous compared to those ripped from ‘ordinary’ CDs at 44.1KHz. But the music industry knows a good marketing tactic when it sees one: Moar Hertz!!
Another design feature of the Audio CD was not just the rate at which the music signal was sampled, but the number of ‘levels’ that could be assigned to each sample: the original CD uses 16 bits to describe the audio sample taken 44,100 times a second. The number of bits used is what gives ‘dynamic range’ to CD audio: the fewer bits, the less difference you can hear between a loud sound and a quiet one. So is 16 bits enough to describe a Beethoven symphony with adequate dynamic range? Well, 16 bits gives you 65,536 (that’s 2¹⁶) different possible volume values, sufficient to achieve a dynamic range of around 96dB. In the real world, a mosquito registers around 15dB and a jet engine at take-off registers at around 110dB. That’s a difference of 95dB, so a 16-bit bit-depth is enough to tell the two apart. I think that might also just about cover the difference between a pp violin and a ffff trumpet!
Nevertheless, the music industry knows that bigger numbers are better, so we can now partake of 24-bit audio. That allows 16,777,216 (2²⁴) possible sound levels -which is way more sound levels than your ears are physically capable of distinguishing from each other.
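For anyone who wants to check those numbers, here is a small Python sketch of the usual rule-of-thumb calculation: the theoretical dynamic range of linear PCM is 20 × log10 of the number of levels, which works out at roughly 6dB per bit. The function names are my own invention, and the formula ignores refinements such as dither and noise shaping.

```python
import math


def levels(bits):
    """Number of distinct values a sample of the given bit depth can take."""
    return 2 ** bits


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM at the given bit depth, in dB."""
    return 20 * math.log10(levels(bits))


for bits in (16, 24):
    print(f"{bits}-bit: {levels(bits):,} levels, "
          f"about {dynamic_range_db(bits):.1f}dB of dynamic range")
```

By the same sum, 24 bits works out at roughly 144dB, which is a good deal more than the gap between the mosquito and the jet engine.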
So the short version is that ‘hi-res audio’ (that is, music sampled above 44.1KHz, and recorded with bit depth of anything above 16) is marketing baloney: your ears cannot possibly tell the difference between it and standard CD.
Now, mention that in certain musical circles and people will get very defensive (if not actually quite aggressive!). They spent up big on their hi-fi systems to be able to play hi-res audio, after all; and here you are with your feeble maths and science witchcraft telling them they’ve wasted their money! And thus do flame wars begin…
Anyway, much as I enjoy laughing at people pretending they can hear how good hi-res audio is when I know full-well that biologically and mathematically, they cannot… let me also be swift to say that I entirely understand that hi-res audio can actually bring discernible benefits to music listeners. Those benefits derive not from the higher sampling frequency and bit-depth of these audio formats, though. Instead, and for starters, these hi-res audio releases are often done after extensive remastering of the original recording. Re-mastering essentially means bringing the latest technology to the problem of re-mixing and re-balancing the original recording tapes, to achieve (perhaps) a smoother sound in which the piccolos ring out more clearly than they did on the original mix. Maybe the sound ends up a bit warmer, or richer, too. Often, a re-mastering can leave the finished recording with a greater dynamic range, so that the quiet bits are quieter and the louder bits louder than they were in the original release. So buying hi-res audio might well mean you can hear the difference from the original CD, because you’re buying a lot of new re-engineering/re-mastering work, not just more samples and more bits.
Another thing that is undoubtedly true: SACD and Blu-Ray Audio (to name but two) have, as part of their foundational specification, the use of more than 2-channels. The original CD standard specified only 2-channel stereo; these newer, hi-res formats, can do 6 or more channels for a surround-sound effect. Now, speaking as someone with only two ears, I’m not sure I need surround sound! When you go to a concert, the violins are on your left; the cellos on the right… and the piccolos aren’t behind you! So again, I’m not sure surround sound really enhances a classical music listener’s experience in any meaningful way. But: you cannot deny that a 5+1 surround sound system is going to sound distinctively different to a 2-channel stereo setup. So yes, again, buying hi-res audio may well result in you ‘hearing the difference’ (but not because there are more samples at greater bit depths).
On the other hand, run hi-res audio through a mediocre stereo system and you will definitely ‘hear the difference’: it will sound awful! If you are dealing with an audio signal with higher dynamic contrasts, it’s quite possible your cheap speakers will render that as a set of unnaturally harsh, bright sounds, where the less demanding signals provided by ye olde traditional CDs are rendered more acceptably.
So, there can be remastering and multi-channel benefits to hi-res audio recordings, for sure… and play them on anything less than really good audio kit, and the extra bits and samples can sound awful. So it is definitely true that hi-res recordings can be discerned from their standard CD cousins. It’s just not for the sampling rate and bit-depth reasons people sometimes think make the difference.
And all of that is why I’ve generally shied away from hi-res audio. I’ve a pair of 57 year-old ears that cannot possibly benefit from a higher sampling rate; my hi-fi equipment is middling at best and can’t really be relied on to handle 24-bits of dynamic range comfortably; and my hard disk resources are not infinite (so I prefer smaller audio-perfect files to much, much larger ones: the extra Megabytes are just wasted on me!) I additionally lack the patience to sit there whilst a 300MB file copies to my server, when a 35MB file of the same music extracted at mere-CD sampling rates and bit depths, would have transferred across just fine.
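To put rough numbers on the storage cost, here is a back-of-envelope Python sketch. It works out the size of uncompressed PCM audio only; FLAC compression will shrink the real files by an amount that depends on the music, so treat the ratios as a guide rather than a prediction of actual file sizes. The function name is made up for illustration.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels=2, seconds=60):
    """Size in bytes of uncompressed PCM audio of the given format and length."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


cd_quality = pcm_bytes(44_100, 16)    # one minute of stereo CD audio
hi_res_88 = pcm_bytes(88_200, 24)     # one minute at 88.2KHz/24-bit
hi_res_176 = pcm_bytes(176_400, 24)   # one minute at 176.4KHz/24-bit

print(f"CD quality:       {cd_quality / 1_000_000:.1f} MB per minute")
print(f"88.2KHz/24-bit:   {hi_res_88 / cd_quality:.0f}x the size")
print(f"176.4KHz/24-bit:  {hi_res_176 / cd_quality:.0f}x the size")
```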
But… but… lately, I have ended up buying some SACDs, because they were the only way I could find to obtain the latest re-mastering of one recording or another. Now, in the past, I’ve taken the SACD signal and ‘thunked it down’ to be standard 16-bit, 44.1KHz, just as it would have been if I’d ripped it from a standard CD. I’m throwing a lot of ‘signal’ away when I do that, but it’s signal my biology rules out me hearing anyway! So I’ve been happy doing that (and probably will continue to do so). But I know a lot of people might not be so blasé about such an approach: if they’ve paid for the extra bits and sampling frequencies, they’d like to know they’re keeping them!
Back in January, I bought myself a new USB DAC (digital-to-analogue converter) which is allegedly capable of playing music with stupidly-high sampling rates -and, somewhat irritatingly- displays the sample rate of whatever music signal it’s fed with. I say ‘irritatingly’ because, once you have a gauge displaying some information, you tend to get a bit hypnotised and hooked on that information (or at least, I do!) I’ve certainly found myself looking at its steady display of ‘44.1’ and wishing it would display something sexier and bigger! That’s especially true when I feed it audio from one of my freshly-acquired SACDs. For example, here’s one I bought recently:
It’s an SACD of a 1960s recording made by Rostropovich and Britten, playing Schumann, Schubert and Debussy cello (ish! – the arpeggione wasn’t actually a cello) works. It was supplied as a single ISO file, which my AUAC tool has been able to decode and decompress quite happily for a long time. With the command auac -i=iso issued at the command line, having previously changed directory to the folder containing the ISO, I get this:
…which indicates that the ISO has been uncompressed, the component music files converted to FLAC (since I didn’t specify an output format, I get ‘CD equivalent FLACs’ by default) and a volume boost has been applied (since SACDs are usually supplied with a ~6dB absolute peak volume decrease, for technical reasons to do with not being able to handle all those bits and samples perfectly at full volume).
So that’s great, but what happens when I play one of the resulting FLAC files? This:
That’s my Topping E30 DAC declaring that it’s being passed a standard CD audio signal, with a 44.1KHz sampling rate. So somewhere along the line, my nice hi-res audio source has become bog-standard CD audio! Well, to be honest: this isn’t really a surprise. AUAC was deliberately written to output only standard CD quality FLAC files, so naturally when they’re played, they are seen as being of CD quality. The down-conversion from hi-res audio to 44.1KHz is thus ‘a design feature’ of AUAC, not a problem as such, and certainly not a mystery 🙂 But it obviously won’t do if someone wants to extract their SACDs into an actually hi-res audio file format.
So, I’ve relented and added a new feature to AUAC: the ability to specify hires as an output format. When AUAC sees that as part of the launch command, it will extract the ISO to its component WAV files as usual, except that it will now output them with a 176.4KHz sample rate (the highest rate that the SACD decoder I’m using can handle). The final FLACs will then also be 176.4KHz/24-bit. (Updated to add on April 7th: AUAC now only outputs 88.2KHz/24-bit FLACs when the --hires switch is specified as an output format; but there is now also an --xhires switch that will output 192KHz/24-bit extremely hi-res FLACs if you want to push things that far. I have left the original blog piece’s mention of 176.4KHz stand, however: the specific sampling frequency used is less important than the fact that beyond-CD-Audio ones are now available from AUAC). If you prefer to do as you have always had to do previously (that is, to down-sample your SACD to ‘standard CD Audio’), then don’t specify hires as an output format; just say ‘flac’ as you have always done (or imply it, by not mentioning an output format at all). When a non-hires output format is specified or implied, AUAC will extract the intermediate WAVs at 88.2KHz and then downsample them further during the conversion to the final output.
So let me re-extract my ISO, this time using the command: auac -i=iso -o=hires (that is, input file is an ISO, the output should be hi-res FLAC). Here’s what happens when I subsequently play the extracted FLACs after issuing that new command:
So, now proper 176.4KHz audio is being sent to my DAC (and it’s in 24-bit, but the DAC display doesn’t tell you that). So: AUAC can now do hi-res FLAC conversion, depending on what your source SACD was sampled at. That means it can extract an SACD ISO to hi-res FLAC or standard FLAC; it can also convert (i.e., down-sample) a hi-res FLAC into an ordinary CD-quality one. And, if you are mad enough to want to try, it will also happily convert a CD-quality audio FLAC into a hi-res one (i.e., auac -i=flac -o=hires will work). Such a conversion is utterly pointless, of course: you cannot invent a music signal that is not in the original, so the resulting hi-res FLAC can contain no better music signal than was in the original, but it will be maybe 3 or 4 times the size of the original. Useful, I suppose, if you have shares in a hard disk manufacturer -but otherwise pointless!
Whenever AUAC creates 24-bit FLACs, their file names will contain a ‘-24’ component, so you will know you’re dealing with hi-res files rather than standard ones (which have a -16 added to their name instead).
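If you would rather check a file’s contents than trust its name, the sample rate and bit depth are recorded in the STREAMINFO block at the very start of every FLAC file. The following Python sketch reads them directly from those first 42 bytes. It is not how AUAC or AMP do anything, the function names are invented for illustration, and it assumes a plain FLAC file with no ID3 tag bolted onto the front.

```python
def read_flac_format(header):
    """Decode sample rate, channel count and bit depth from a FLAC file's first 42 bytes."""
    if header[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    # Byte 4 is the first metadata block header: top bit is the
    # 'last block' flag, the low 7 bits are the block type (0 = STREAMINFO).
    if header[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    info = header[8:42]  # the 34-byte STREAMINFO body
    if len(info) < 18:
        raise ValueError("header too short to contain STREAMINFO")
    # Bytes 10-17 of STREAMINFO pack: 20 bits of sample rate,
    # 3 bits of (channels - 1), 5 bits of (bits per sample - 1),
    # then 36 bits of total sample count.
    packed = int.from_bytes(info[10:18], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
    }


def flac_format_of(path):
    """Read the format details of the FLAC file at the given path."""
    with open(path, "rb") as flac_file:
        return read_flac_format(flac_file.read(42))
```

Calling flac_format_of on one of the extracted files should then report its real sample rate and bit depth, whatever the file happens to be called.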
AMP also gains the ability to play hi-res FLAC files. Unfortunately, this isn’t as easy as it might be thought, because -at least on my home PC- it won’t do it by default. Here’s AMP playing one of my earlier known-to-be-176.4KHz files, taken from that Britten/Rostropovich SACD I mentioned earlier:
The music is coming through as 48KHz, which we know isn’t correct. So what’s going on?
Well, it turns out to be a hardware-related problem. AMP up until this point has played music by using a variation on a command such as ffmpeg -i <file> alsa default. That is, take some music file as input and pipe it out to whatever hardware device the ALSA sound subsystem considers to be the default device. Now, clearly, the audio has made it to my DAC, so that ‘default device’ can’t be completely wrong -but something about its configuration or specification on my system means that the audio stream is being down-converted before it reaches the actual hardware.
This proved surprisingly tricky to track down, in fact, but I finally worked out that if you specify a non-default device, you can potentially achieve better results. In my case, I took a look in the Strawberry music player, which also suffered from the ‘everything displays as 48KHz’ problem. Here’s where you specify what hardware Strawberry is to play to:
See how you can select to play to an ALSA sound card? The program then lets you pick the specific physical device (in this case my Topping E30 DAC) and then… over on the right, it displays the ‘hardware address’ of the selected device. In this case, it’s displaying ‘hw:2,0’, because it’s the second sound device on my system (actually, it’s the third, but Linux enumerates them starting at 0, so the third device gets number 2!)
Do you notice in that last screenshot that there’s also a couple of radio buttons, to switch between ‘hw’ and ‘plughw’? I cannot tell you with complete confidence what the difference between the two settings is, but I can tell you that by trial and error, I found out that using the ‘plughw’ option resulted in proper 192KHz playback, where ‘hw’ resulted in the ‘48KHz’ display you’ve already seen. Knowing this, I altered AMP to use, essentially, this command: ffmpeg -i <file> alsa plughw:2,0. That is essentially the same play command as before, but now it’s directing the audio output to a specific hardware device, not ‘default’. And the result is that AMP ends up sending a proper hi-res audio signal to my DAC. Having worked that out, I simply needed a way for a user to be able to tell AMP which hardware device to play to: and that has now resulted in another new run-time switch: --device=xxxx, where ‘xxxx’ is the hardware address of the relevant sound card or device. So, where I previously ran AMP with this command:
amp --musicdir=/sourcedata/music/classical --usedb --dbname=main --selections=3
…I now need to run it with this one:
amp --musicdir=/sourcedata/music/classical --usedb --dbname=main --selections=3 --device=plughw:2,0
If I forget that last ‘device’ switch, all my FLACs still play just fine, but the hi-res ones get transparently down-converted to a 48KHz sampling rate as they are played. They always remain hi-res on disk, but their musical signal is modified as it is transmitted to the sound card. With the new switch present, however, all FLACs play in whatever internal format they happen to be stored. Ordinary CD-quality FLACs are still 44.1KHz, for example; but hi-res FLACs arrive at the DAC in all their high bit rate-and-depth glory.
What device you need to plug into the command will, of course, depend on your particular hardware setup. It can be difficult to work out, though the command aplay --list-devices can give you a clue as to the device number at least:
In my case, I can see my E30 DAC is listed as ‘card 2’, and there are no sub-device numbers to worry about, so I knew I’d have to be dealing with at least something like ‘hw:2,0’. But I wouldn’t have guessed it was plughw:2,0 until I had a look at what worked correctly for the Strawberry audio player. There will be a certain amount of experimentation required, in other words, to get the correct hardware device ID.
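If you would rather not pick through that listing by eye, the card and device numbers can be pulled out of it programmatically. What follows is a minimal Python sketch, not part of AMP. The sample listing is invented for illustration (only the fact that the E30 sits on ‘card 2’ comes from my own system), the function names are made up, and the regular expression assumes the usual ‘card N: … , device M: …’ layout of aplay’s output.

```python
import re

# Invented sample of what 'aplay --list-devices' prints; real output will differ.
SAMPLE_LISTING = """\
**** List of PLAYBACK Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC892 Analog [ALC892 Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 2: E30 [E30], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+):")


def alsa_devices(listing):
    """Return the card number, card name and device number of each listed device."""
    devices = []
    for line in listing.splitlines():
        match = CARD_LINE.match(line)
        if match:
            card, _short_name, name, device = match.groups()
            devices.append({"card": int(card), "name": name, "device": int(device)})
    return devices


def device_id(devices, name_fragment, prefix="plughw"):
    """Build a device string such as 'plughw:2,0' for the first card whose name matches."""
    for entry in devices:
        if name_fragment.lower() in entry["name"].lower():
            return f"{prefix}:{entry['card']},{entry['device']}"
    return None


print(device_id(alsa_devices(SAMPLE_LISTING), "E30"))
```

To run it against a real system, feed it the text produced by aplay --list-devices instead of the sample (for example via Python’s subprocess module). It only saves you reading the numbers off the listing: whether you need ‘hw’ or ‘plughw’ in front of them is still a matter of experiment.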
I would also want to stress again at this point what I opened this blog piece with: no matter whether AMP sends through the music signal at 48KHz or 176.4KHz, I can’t hear the difference! If there weren’t a big set of numbers on the Topping DAC’s front display, I wouldn’t have had the faintest clue what sample rate my music was playing at, at all! From my ears’ perspective, therefore, I’d simply stick to 16-bit 44.1KHz music files all day, every day! But I know there are some people who will want to know AMP can handle hi-res audio files: and now it can, provided only you can feed it an appropriate hardware device ID, assuming ‘default’ doesn’t achieve the hi-res output required anyway, as it didn’t do in my case.
Don’t forget you can alias complex commands on Linux, so I don’t have to keep remembering to type that complicated-looking device hardware ID. I simply modify my .bashrc so it reads something like:
alias ampr='amp --musicdir=/sourcedata/music/classical --usedb --dbname=main --selections=3 --device=plughw:2,0'
alias ampo='amp --musicdir=/sourcedata/music/overflow --usedb --dbname=overflow --selections=3 --device=plughw:2,0'
alias ampl='amp --device=plughw:2,0'
alias amprefresh='amp --musicdir=/sourcedata/music/classical