22 Explain this masking effect. (MPEG-Audio)





This article is from the MPEG FAQ, by Frank Gadegast phade@cs.tu-berlin.de with numerous contributions by others.

OK, say you have a strong tone with a frequency of 1000Hz.
You also have a tone nearby of say 1100Hz. This second tone is
18 dB lower. You are not going to hear this second tone. It is
completely masked by the first 1000Hz tone. As a matter of
fact, any relatively weak sound near a strong sound is
masked. If you introduce another tone at 2000Hz, also 18 dB
below the first 1000Hz tone, you will hear it.
You will have to turn down the 2000Hz tone to something like
45 dB below the 1000Hz tone before it will be masked by the
first tone. So the further you get from a sound, the less
masking effect it has.
The masking effect means that you can raise the noise floor
around a strong sound because the noise will be masked anyway.
And raising the noise floor is the same as using fewer bits,
and using fewer bits is the same as compression. Do you get it?
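
The tone example above can be played with in a few lines of Python. This is only a toy sketch: the straight-line fall-off per octave and the two constants are assumptions chosen so that the model agrees with the numbers quoted above (masked at 1100Hz and -18 dB, audible at 2000Hz and -18 dB, masked at roughly -45 dB). It is not the psychoacoustic model an MPEG encoder uses, and real masking is not symmetric around the masker.

```python
import math

# Hypothetical constants, picked only to match the example in the text.
PEAK_OFFSET_DB = -10.0       # assumed threshold right next to the masker
SLOPE_DB_PER_OCTAVE = 35.0   # assumed fall-off; gives -45 dB one octave away


def masking_threshold_db(masker_hz, probe_hz):
    """Masked threshold at probe_hz, in dB relative to the masker level."""
    octaves = abs(math.log2(probe_hz / masker_hz))
    return PEAK_OFFSET_DB - SLOPE_DB_PER_OCTAVE * octaves


def is_masked(masker_hz, probe_hz, probe_rel_db):
    """True if a probe at probe_rel_db (relative to the masker) is inaudible."""
    return probe_rel_db <= masking_threshold_db(masker_hz, probe_hz)


print(is_masked(1000, 1100, -18))  # True: close to the masker, 18 dB down
print(is_masked(1000, 2000, -18))  # False: an octave away, 18 dB down
print(is_masked(1000, 2000, -46))  # True: an octave away, 46 dB down
```

Anything that falls under the threshold, whether a weak tone or quantization noise, goes unheard, which is why the coder can let the noise floor rise there.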

I don't get it. (MPEG-Audio)

Well, let me try to explain how the MPEG Audio Layer-2 encoder
goes about its thing. It divides the frequency spectrum (20Hz
to 20kHz) into 32 subbands. Each subband holds a little slice
of the audio spectrum. Say, in the upper region of subband 8,
a 6500Hz tone with a level of 60dB is present. OK, the
coder calculates the masking effect of this sound and finds
that there is a masking threshold for the entire 8th
subband (all sounds w. a frequency...) at a level of 35dB.
The acceptable s/n ratio is thus 60 - 35 = 25 dB. That equals
about 4 bit resolution, since each bit is worth roughly 6 dB.
In addition there are masking effects on bands 9-13 and on
bands 5-7, the effect decreasing with the distance from band 8.
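
The arithmetic in this example can be written out as a small Python sketch. It treats 35 dB as the level of the masking threshold, which is what the subtraction 60 - 35 implies, and it uses the usual rule of thumb of about 6.02 dB of signal-to-noise ratio per bit. The function names are made up for illustration; a real Layer-2 encoder picks from a table of allowed quantizers rather than dividing by a constant.

```python
DB_PER_BIT = 6.02  # 20*log10(2): each extra bit lowers quantization noise ~6 dB


def required_snr_db(tone_level_db, threshold_level_db):
    """S/N ratio needed so quantization noise stays at the masking threshold."""
    return tone_level_db - threshold_level_db


def bits_for_snr(snr_db):
    """Approximate quantizer resolution (in bits) that delivers snr_db."""
    return snr_db / DB_PER_BIT


snr = required_snr_db(60, 35)
bits = bits_for_snr(snr)
print(snr, round(bits, 2), round(bits))  # 25 4.15 4
```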
In a real-life situation you have sounds in most bands and the
masking effects are additive. In addition the coder considers
the sensitivity of the ear for various frequencies. The ear
is a lot less sensitive at high and low frequencies. Peak
sensitivity is around 2 - 4kHz, the same region that the human
voice occupies.
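
To get a feel for this sensitivity curve, here is a sketch using a widely quoted closed-form approximation of the threshold in quiet (usually attributed to Terhardt). The formula is not part of this FAQ and is not necessarily what any particular encoder uses; it is only one common way to model the curve described above.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate absolute hearing threshold in dB SPL (Terhardt-style fit)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


# Scan 100 Hz .. 16 kHz and find where the ear is most sensitive,
# i.e. where the threshold is lowest.
freqs = range(100, 16001, 100)
most_sensitive = min(freqs, key=threshold_in_quiet_db)
print("lowest threshold near", most_sensitive, "Hz")

for f in (100, 1000, 3000, 10000, 15000):
    print(f, "Hz:", round(threshold_in_quiet_db(f), 1), "dB SPL")
```

The minimum lands in the 2 - 4kHz region, and the threshold climbs steeply toward both ends of the spectrum, so noise there can be louder before anyone hears it.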
The subbands should match the ear, that is, each subband should
consist of frequencies that have the same psychoacoustic
properties. In MPEG Layer-2, each subband is 750Hz wide
(with 48 kHz sampling frequency, 24 kHz is split into 32
equal bands). It would have been better if
the subbands were narrower in the low frequency range and
wider in the high frequency range. That is the trade-off
Layer-2 took in favour of a simpler approach.
Layer-3 has a much higher frequency resolution (18 times
more) - and that is one of the reasons why Layer-3 has a much
better low bitrate performance than Layer-2.
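
The subband figures can be checked with a short Python sketch. It assumes the 32 subbands split the range from 0 Hz to half the sampling frequency evenly and that subbands are counted from zero, which is the numbering under which the 6500Hz tone of the earlier example sits in the upper part of subband 8. The helper names are illustrative only.

```python
NUM_SUBBANDS = 32
LAYER3_FACTOR = 18  # the "18 times more" frequency resolution from the text


def subband_width_hz(sample_rate_hz):
    """Width of one subband when 0 .. sample_rate/2 is split evenly."""
    return sample_rate_hz / 2 / NUM_SUBBANDS


def subband_index(freq_hz, sample_rate_hz):
    """Zero-based index of the subband that contains freq_hz."""
    return int(freq_hz // subband_width_hz(sample_rate_hz))


width = subband_width_hz(48000)
print(width)                                # 750.0
print(subband_index(6500, 48000))           # 8 (covers 6000 .. 6750 Hz)
print(round(width / LAYER3_FACTOR, 1))      # 41.7 Hz per Layer-3 frequency line
```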
But there is more to it. I have explained concurrent masking,
but the masking effect also occurs before and after a strong
sound (pre- and postmasking).

 
