Can you really hear the difference with 192kHz / 24-bit high-resolution uncompressed audio?


By

emmolos

The number of devices that support high-resolution audio, which is said to offer better sound quality than before, keeps growing, and support is spreading beyond music players to smartphones. High-resolution audio is booming, but there are also people who say, 'I actually listened to it and couldn't really tell the difference.' A blog post has been published that examines, from a technical point of view, whether high resolution really changes the sound.

24/192 Music Downloads are Very Silly Indeed
http://xiph.org/~xiphmont/demo/neil-young.html

The author is Chris 'Monty' Montgomery, who sets out his own case about the spread of high-resolution audio. Mr. Montgomery, who also developed 'Vorbis', the license-free audio compression format that anyone can use, draws on his specialist knowledge to argue from the following viewpoints that high-resolution audio, which needlessly inflates file sizes, is unnecessary.
・ Harmful effects of high rate sampling
・ Difference between 16-bit and 24-bit
・ Results of listening comparison test by 'blind test'
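
To put the file-size point in numbers, the data rate of uncompressed PCM is simply sampling frequency × bit depth × channels. The following is a minimal sketch, assuming two-channel stereo; the channel count and the helper name are assumptions made for this example.

```python
def pcm_bitrate_bps(sample_rate_hz, bit_depth, channels=2):
    """Uncompressed PCM data rate in bits per second (stereo assumed by default)."""
    return sample_rate_hz * bit_depth * channels

cd = pcm_bitrate_bps(44_100, 16)        # CD quality
hires = pcm_bitrate_bps(192_000, 24)    # 192kHz / 24bit

print(f"CD:     {cd / 1000:.1f} kbps")
print(f"Hi-res: {hires / 1000:.1f} kbps ({hires / cd:.1f}x CD)")
```

Under these assumptions, uncompressed 192kHz / 24bit stereo needs roughly 6.5 times the space of CD audio.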



◆ Harmful effects of high-rate sampling Part 1: The audible range of the human ear and the playback range of high-resolution audio sources
High-resolution audio is said to sound better because it can reproduce higher frequencies than a CD. The usual benchmark for high resolution is 'a sampling frequency of 96kHz or more'; the Japan Audio Society defines it as audio that 'can reproduce sound of 40kHz or more', and JEITA (Japan Electronics and Information Technology Industries Association) defines it as 'digital audio that exceeds the CD specifications'. In theory, such audio can record and play back frequencies far above the 20kHz upper limit of human hearing. Montgomery dismisses this outright, arguing that 'there is no need for sounds above 20kHz that are inaudible to the human ear.'
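
The link between sampling frequency and the highest reproducible pitch can be checked with a little arithmetic: a PCM stream can represent frequencies only up to half its sampling rate (the Nyquist limit). The sketch below is illustrative; the function name and the list of rates are choices made for this example.

```python
JAS_THRESHOLD_HZ = 40_000   # "40kHz or more" from the Japan Audio Society definition

def nyquist_hz(sample_rate_hz):
    """Highest frequency a PCM stream at this sampling rate can represent."""
    return sample_rate_hz / 2

for rate in (44_100, 48_000, 96_000, 192_000):
    limit = nyquist_hz(rate)
    print(f"{rate / 1000:g}kHz sampling -> up to {limit / 1000:g}kHz, "
          f"covers 40kHz: {limit >= JAS_THRESHOLD_HZ}")
```

CD's 44.1kHz already reaches 22.05kHz, just above the audible limit, while 96kHz is the lowest of these common rates that clears the 40kHz mark.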

Montgomery suggests that the way to approach this question is to consider how researchers originally arrived at the conclusion that the audible range is 20Hz to 20kHz. The graph he cites plots, by frequency, the loudest and the softest sounds the human ear can perceive. It shows that sensitivity falls away sharply at the high-frequency end and that sounds above 20kHz are almost imperceptible.



◆ Harmful effects of high-rate sampling Part 2: 192kHz adversely affects the sound quality of equipment
Montgomery also asserts that high-resolution audio, by carrying ultrasonic content the human ear cannot hear, causes distortion in the playback equipment, and that it therefore 'has no benefit'.

No audio equipment on the market is entirely free of the distortion that muddies sound, but conventional equipment has managed to keep distortion low within the frequency band that matters. With high-resolution audio that contains even higher frequencies, however, problems that could previously be avoided are said to come to the surface.

Montgomery points out that audio equipment generally copes poorly with distortion at the extremes of very low and very high frequencies, so feeding it the ultrasonic components of a high-resolution source can readily muddy the sound. His graph shows that when two strong tones outside the audible range (shown in light blue) are input, at 30000Hz (30kHz / red) and 33000Hz (33kHz / green), intermodulation distortion generates extra distortion products (blue). These products also fall inside the light-blue audible range, where they muddy the sound that can actually be heard.
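
As a rough illustration of where such distortion products land, the sketch below lists the intermodulation frequencies |m·f1 + n·f2| for two input tones. The function name `intermod_products` and the cutoff at third order are assumptions made for this example, not something taken from Montgomery's article; real equipment also produces higher-order products at lower levels.

```python
def intermod_products(f1_hz, f2_hz, max_order=3):
    """Map each intermodulation frequency |m*f1 + n*f2| to its lowest order |m| + |n|."""
    products = {}
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            order = abs(m) + abs(n)
            if m == 0 or n == 0 or order > max_order:
                continue  # skip DC and plain harmonics; keep only mixed terms
            freq = abs(m * f1_hz + n * f2_hz)
            if freq not in products or order < products[freq]:
                products[freq] = order
    return products

products = intermod_products(30_000, 33_000)
audible = sorted(f for f in products if 20 <= f <= 20_000)
print(audible)  # [3000]
```

With 30kHz and 33kHz inputs, the second-order difference tone falls at 3kHz, well inside the audible range, even though both inputs are ultrasonic.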



Mr. Montgomery lists the following four ways to avoid this problem and argues that only the fourth actually makes sense.

1: Add a dedicated ultrasonic speaker, amplifier and crossover to the system so that the ultrasonic components are completely separated and cannot affect the audible sound.
2: Use audio equipment that can handle ultrasonic components. However, this is costly and may compromise sound quality in the normal range.
3: Design the equipment's circuitry carefully so that it does not reproduce the ultrasonic range at all.
4: Make sure the original file contains no ultrasonic components, so that the problem never arises.


As a demonstration of intermodulation distortion, Mr. Montgomery provides the following five test signals (six files) on his site. Except for the last one, they contain only content above 20kHz and should therefore be inaudible, but depending on the playback device you may hear faint intermodulation distortion.

・ 30kHz tone + 33kHz tone (96kHz / 24bit) [ 5 seconds ・ WAV ] [ 30 seconds ・ FLAC ]
・ 26kHz-48kHz warbling tones (96kHz / 24bit) [ 10 seconds ・ WAV ]
・ 26kHz-96kHz warbling tones (192kHz / 24bit) [ 10 seconds ・ WAV ]
・ Song clip shifted up by 24kHz (96kHz / 24bit) [ 10 seconds ・ WAV ] (music shifted into the ultrasonic range above 24kHz; almost silent)
・ Original version of the above (44.1kHz / 24bit) [ 10 seconds ・ WAV ] (produces actual sound; watch the volume)



◆ Harmful effects of high-rate sampling Part 3: Sampling noise and oversampling
Digitizing an analog sound is the process of slicing its waveform into very short time intervals and converting each slice to a number (sampling). The accompanying figure simply overlays the digital data obtained by sampling (red) on the original waveform (light blue); it shows that data chopped up along the time axis inevitably looks jagged compared with the smooth original.



Furthermore, the higher-pitched (finer) the waveform, the harder it becomes to capture the original accurately. The higher the sound, the more likely it is that the waveform and the sample timing will fail to line up, and the accompanying figure clearly shows digital data that ends up far from the original waveform.



Components too high to be captured correctly turn into spurious components that were not in the original sound and muddy it (aliasing, also called folding noise). To remove them, the signal is passed through a low-pass filter, a type of equalizer, before digital conversion, but this 'anti-aliasing filter' tends to be disliked because it can adversely affect the sound. 'Oversampling' is the method used to get around this problem.
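
Where folding noise ends up can be worked out directly: without an anti-aliasing filter, a tone above half the sampling frequency is "folded" back around the nearest multiple of the sampling frequency. The sketch below assumes ideal sampling of a pure tone; the function name is hypothetical.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after ideal sampling with no anti-aliasing filter."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

print(alias_frequency(30_000, 44_100))  # 14100: an inaudible tone folds into the audible band
print(alias_frequency(1_000, 44_100))   # 1000: tones below half the sampling rate are unchanged
print(alias_frequency(30_000, 96_000))  # 30000: captured as-is at 96kHz
```

This is why the input has to be low-pass filtered before conversion: an unfiltered 30kHz component sampled at 44.1kHz would reappear as a 14.1kHz tone.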

Put simply, oversampling means sampling at 96kHz or 192kHz, faster than the 44.1kHz of the CD standard, to obtain a smoother waveform. This suppresses aliasing and, as a result, reduces the adverse effects of the anti-aliasing filter. Because it yields good sound quality simply by raising the sampling frequency, it is a simple and efficient method, and it is no exaggeration to say that it is now used in almost all audio equipment.
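
One way to see why a higher sampling frequency eases the burden on the anti-aliasing filter is to look at how much room the filter has between the top of the audible band and half the sampling frequency. The sketch below assumes a 20kHz passband edge; the function name and that edge are choices made for this example.

```python
import math

AUDIBLE_LIMIT_HZ = 20_000  # assumed passband edge: the audible limit cited in the article

def transition_band(sample_rate_hz, passband_hz=AUDIBLE_LIMIT_HZ):
    """Return the filter's available roll-off region as (width in Hz, width in octaves)."""
    nyquist = sample_rate_hz / 2
    return nyquist - passband_hz, math.log2(nyquist / passband_hz)

for rate in (44_100, 96_000, 192_000):
    width_hz, width_oct = transition_band(rate)
    print(f"{rate / 1000:g}kHz: {width_hz / 1000:g}kHz of room ({width_oct:.2f} octaves)")
```

At 44.1kHz the filter must go from passing to blocking within about 2kHz (roughly 0.14 octave), whereas at 192kHz it has 76kHz (over two octaves), so a much gentler filter suffices.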



Oversampling is used extensively in recording in particular, because it helps secure sound quality. Mr. Montgomery, however, says that 'since the result is ultimately converted to 44.1kHz, as before, when it is reduced to CD quality, it doesn't make much sense.' Quantization noise cannot in principle be avoided as long as PCM sampling of the kind described above is used, so it is not a problem limited to high-resolution audio. In that sense, some readers may have doubts about this part of Mr. Montgomery's argument.

◆ Difference between 16-bit and 24-bit
Alongside sampling frequency, bit depth is the other key to high resolution. The benefit of a higher bit depth is a wider dynamic range, meaning that a greater difference between quiet and loud sounds can be represented. A wider dynamic range is said to let even quiet sounds be heard clearly and to raise the resolution of the sound, so that more bits mean better sound quality. Mr. Montgomery questions this as well.

Mr. Montgomery acknowledges that 'a 16-bit linear PCM source cannot, in theory, cover the entire dynamic range that humans can hear,' but argues that higher bit depths were introduced mainly because of their benefits in recording.

The loudest sound a healthy young person can tolerate is said to be about 140 dB, while a 16-bit source can reproduce a dynamic range of 96 dB, which on paper falls short of human hearing. A 24-bit source, with a dynamic range of 144 dB, is clearly superior in this respect, but what Montgomery focuses on here is the dynamic range that is actually needed.
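
The 96 dB and 144 dB figures follow from the number of levels a linear PCM sample can take: each extra bit doubles the range, which adds about 6 dB. A minimal sketch of that arithmetic (the function name is chosen for this example):

```python
import math

def dynamic_range_db(bit_depth):
    """Ratio between the largest value and one quantization step, in decibels."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

This gives about 96.3 dB for 16-bit and 144.5 dB for 24-bit, matching the rounded figures quoted above.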

On the source site you can listen to files containing a 1kHz test tone at the theoretical maximum level of 0dB and at a very low -105dB, which lies slightly beyond the nominal range of a 16-bit source. The 0dB tone is clearly audible, but the -105dB tone, although it is present in the data, is almost inaudible. From this, Montgomery argues that 'a 16-bit dynamic range is generally sufficient for playback,' and that here too high-resolution audio is unnecessary.

1kHz tone at 0 dB (16 bit / 48kHz WAV)



1kHz tone at -105 dB (16 bit / 48kHz WAV)
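
To get a feel for how small -105dB is, the level can be converted to a linear amplitude and expressed in 16-bit steps. The sketch below is illustrative only; the helper name and the use of 32767 as the 16-bit full-scale peak are assumptions made for this example.

```python
FULL_SCALE_16BIT = 32767  # assumed peak value of a signed 16-bit sample

def db_to_amplitude(level_db):
    """Convert a level in dB relative to full scale into a linear amplitude ratio."""
    return 10 ** (level_db / 20)

for level in (0, -105):
    ratio = db_to_amplitude(level)
    print(f"{level} dB -> ratio {ratio:.3g}, peak of {ratio * FULL_SCALE_16BIT:.3g} steps")
```

The -105dB tone peaks at about 0.18 of a single 16-bit step, which gives a sense of why it is almost impossible to hear at a normal playback volume.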



Even Mr. Montgomery, however, acknowledges that higher bit depths are useful in recording. The reason he gives is safety margin: there is headroom so that the recording does not distort even when an unexpectedly loud sound comes in, and conversely, a recording made at too low a level is not buried in noise. It is a claim that might make engineers who are particular about sound faint on hearing it, but because it rests on scientific theory, it carries a certain persuasive power.

◆ Results of listening comparison test by 'blind test'
In the end, the best way to find out whether the sound has actually improved is to rely on human judgment. Mr. Montgomery reviewed the results of several listening comparison tests that have actually been carried out. In one test comparing high-resolution sources with CD-quality versions, the rate of correct answers over 554 trials, with listeners including audiophiles and professional engineers, was 49.8%, slightly under half.

BAS Experiment Explanation page - Oct 2007
http://www.bostonaudiosociety.org/explanation.htm
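
How close is 49.8% to pure guessing? The sketch below compares the result with what coin-flip answers would produce over 554 trials. The count of 276 correct answers is reconstructed here by rounding 49.8% of 554, and the function names are hypothetical.

```python
import math

def chance_z_score(correct, trials, p_chance=0.5):
    """How many standard deviations the result lies from the guessing average."""
    expected = trials * p_chance
    std_dev = math.sqrt(trials * p_chance * (1 - p_chance))
    return (correct - expected) / std_dev

def prob_at_most(correct, trials):
    """Exact probability of `correct` or fewer right answers from coin-flip guessing."""
    favourable = sum(math.comb(trials, k) for k in range(correct + 1))
    return favourable / 2 ** trials

TRIALS = 554
CORRECT = round(0.498 * TRIALS)  # 276, reconstructed from the reported 49.8%

z = chance_z_score(CORRECT, TRIALS)
p_low = prob_at_most(CORRECT, TRIALS)
print(f"z = {z:.3f}, P(<= {CORRECT} correct by guessing) = {p_low:.3f}")
```

The result sits less than a tenth of a standard deviation from the 277 correct answers expected from guessing, so statistically it cannot be told apart from chance.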



Incidentally, according to the Boston Audio Society page above, the following recordings were often used for the comparison.

Patricia Barber – Nightclub (Mobile Fidelity UDSACD 2004)
Chesky: Various --An Introduction to SACD (SACD204)
Chesky: Various --Super Audio Collection & Professional Test Disc (CHDVD 171)
Stephen Hartke: Tituli / Cathedral in the Thrashing Rain; Hilliard Ensemble / Crockett (ECM New Series 1861, cat. No. 476 1155, SACD)
Bach Concertos: Perahia et al; Sony SACD
Mozart Piano Concertos: Perahia, Sony SACD
Kimber Kable: Purity, an Inspirational Collection SACD T Minus 5 Vocal Band, no cat. #
Tony Overwater: Op SACD (Turtle Records TRSA 0008)
McCoy Tyner Illuminati SACD (Telarc 63599)
Pink Floyd, Dark Side of the Moon SACD (Capitol / EMI 82136)
Steely Dan, Gaucho, Geffen SACD
Alan Parsons, I, Robot DVD-A (Chesky CHDD 2003)
BSO, Saint-Saens, Organ Symphony SACD (RCA 82876-61387-2 RE1)
Carlos Heredia, Gypsy Flamenco SACD (Chesky SACD266)
Shakespeare in Song, Phoenix Bach Choir, Bruffy, SACD (Chandos CHSA 5031)
Livingston Taylor, Ink SACD (Chesky SACD253)
The Persuasions, The Persuasions Sing the Beatles, SACD (Chesky SACD244)
Steely Dan, Two Against Nature, DVD-A (24,96) Giant Records 9 24719-9
McCoy Tyner with Stanley Clark and Al Foster, Telarc SACD 3488

Many similar tests have also been carried out in Japan, and they often appear to produce surprising results.

Can high-resolution audio sources be heard by the human ear? Verify with a forbidden blind test! (1/6) --Phile-web
http://www.phileweb.com/review/article/201311/06/982.html

◆ Advice: How to enjoy really good sound
Based on these findings, Montgomery offers several pieces of advice.

・ Use good headphones
One of the most effective steps is to use headphones with good sound quality. Any type will do, whether open-back, closed-back, or in-ear. What matters, he says, is to avoid products where you 'pay for the brand and design' and to choose ones whose quality matches their price.

・ Choose a lossless sound source
Widely used lossy formats such as MP3 and AAC thin out some of the sound components to make the file compact. FLAC and uncompressed WAV files avoid this, so they guarantee the best possible quality at least at the source stage.

・ Get a better master sound source
However good the file format, its potential is wasted if the original recording is of poor quality. To enjoy good sound, you need to seek out works that sound good and, where possible, the best available version, such as a 'remastered edition'.

◆ Summary
Mr. Montgomery's 'high resolution is unnecessary' argument may come across as somewhat extreme, but parts of it are convincing. This article dealt with PCM high-resolution audio, but it should not be forgotten that there is also the DSD format (used in SACD: Super Audio CD), which works on a completely different principle and is said to behave in an 'analog-like' way even though it handles digital data. As Montgomery's advice suggests, the only way to find good sound is to listen to many well-recorded sources, hone your ears and sensibilities, and cultivate a discerning ear that can recognize 'really good sound' without being swayed by the various voices around you.

in Note,   Hardware,   Science, Posted by darkhorse_log