5: Parallel processing and spectral slits


So far, we've seen that the ear-brain system responds to auditory stimuli through a mechanism we call auditory critical bands; that the ear's absolute threshold of hearing varies with frequency and spans roughly 120 dB of dynamic range (for simple tones); that this dynamic range collapses drastically (to roughly 10-30 dB) in the presence of a masking stimulus; and that many characteristics of a sound wave are involved in sound localization, particularly ILDs and ITDs
Spectral slits and parallel processing describe how the human ear divides a spectrum into narrow regions and analyzes those regions simultaneously
Consider the following burst of composite tones (amplitude on the vertical axis, time on the horizontal):
The waveform describes a composite burst of tones in time
We know from auditory critical bands that the human ear first 'spreads' this packet across the basilar membrane, separating the composite signal into discrete tones
Additionally, we know that auditory critical bands overlap and that each spans approximately 1 mm of the membrane
Now we introduce the concept of spectral slits, which are simply narrow windows of the spectrum along certain regions of the basilar membrane (we are now looking at groups of ~1 mm critical bands)
For instance, after spreading the above composite signal along the basilar membrane, we can arbitrarily arrange the critical bands into 4 groups, or slits, of the spectrum
These four slits (or groupings of critical bands) together represent the original spectrum, illustrating that different portions of the basilar membrane respond to different frequencies
The human ear processes these spectral slits in parallel, ultimately reconstructing the original composite signal
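The 'spread, then recombine' idea above can be sketched in code. Here is a minimal numpy sketch that splits a composite tone burst into four contiguous spectral slits via FFT masking and then sums them back together; the tone frequencies, sample rate, and band edges are illustrative assumptions, not values from any study:

```python
import numpy as np

def split_into_slits(signal, sample_rate, band_edges_hz):
    """Split a signal into contiguous spectral slits via FFT masking.

    band_edges_hz: N+1 edges defining N contiguous bands.
    Returns a list of N time-domain signals, one per slit.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    slits = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        mask = (freqs >= lo) & (freqs < hi)  # keep only this band's bins
        slits.append(np.fft.irfft(spectrum * mask, n=len(signal)))
    return slits

# Hypothetical composite burst: three tones at 500, 1500, and 4000 Hz
fs = 16000
t = np.arange(0, 0.05, 1 / fs)
burst = sum(np.sin(2 * np.pi * f * t) for f in (500.0, 1500.0, 4000.0))

# Four logarithmically spaced slits spanning roughly the speech range
edges = np.geomspace(300.0, 6000.0, 5)
slits = split_into_slits(burst, fs, edges)

# Because the bands are contiguous and cover every tone, summing the
# slits ("parallel processing, then recombination") recovers the burst
recombined = sum(slits)
```

Each slit here contains only the tones that fall inside its band, analogous to a region of the basilar membrane responding only to its own frequency range.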
The snippet above happens to be from a bit of speech, and an interesting psychoacoustic side effect of this 'spread then process' technique follows:
Four spectral channels distributed over the speech-audio range (0.3-6 kHz) are sufficient for human listeners to decode the material with nearly 90% accuracy, even though more than 70% of the spectrum is missing.  Word recognition often remains relatively high (75-83%) when just two or three channels are presented concurrently, despite the fact that the intelligibility of these same slits, presented in isolation, is less than 9%
The partitioning of the speech bands was performed in logarithmic units, because psychophysical tuning, as measured by the equivalent rectangular bandwidth (ERB_N), follows a simple logarithmic function in the mid- and high-frequency regions and is approximately constant in width at 1/6 octave.  
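The ERB_N just mentioned has a standard formula (Glasberg & Moore, 1990), and a quick calculation shows why "approximately constant at 1/6 octave" holds in the mid/high range:

```python
import numpy as np

def erb_n(freq_hz):
    """Equivalent rectangular bandwidth (Glasberg & Moore, 1990), in Hz."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)

# At mid/high frequencies the ERB approaches a fixed fraction of the
# center frequency -- roughly 11%, close to a 1/6-octave band
# (a 1/6-octave band is 2**(1/12) - 2**(-1/12) ~ 11.6% of center)
for f in (500.0, 1000.0, 2000.0, 4000.0):
    print(f"{f:6.0f} Hz: ERB = {erb_n(f):6.1f} Hz "
          f"({100 * erb_n(f) / f:.1f}% of center)")
```

At low frequencies the ERB is a larger fraction of center frequency, which is why the constant-fraction (logarithmic) approximation is stated only for the mid- and high-frequency regions.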
Let's explore this situation while introducing more than four slits
Prior to combination, the target stimuli were filtered into 30 contiguous frequency bands ranging from 80 to 7563 Hz 
The above shows that a minimum of 20 of these 1/6-octave bands must be present to avoid noticeably degrading intelligibility.
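The 30 contiguous logarithmic bands described above are easy to construct; this sketch simply computes the band edges for the cited 80-7563 Hz range (the per-band width in octaves falls out of the range and the band count):

```python
import numpy as np

# 31 edges define 30 contiguous, logarithmically spaced bands
# spanning 80 - 7563 Hz (the range cited above)
edges = np.geomspace(80.0, 7563.0, num=31)

# Logarithmic spacing means every band spans the same frequency ratio;
# expressed in octaves:
octaves_per_band = np.log2(edges[1] / edges[0])
print(f"{octaves_per_band:.3f} octaves per band")  # prints "0.219 octaves per band"
```

Any pair of adjacent edges can then drive a bandpass filter, giving the 30 slits from which subsets are selected and recombined.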
It's surprising how much information can be deleted while still maintaining extremely high intelligibility rates 
