RichP714

5: Parallel processing and spectral slits


So far, we've seen that the ear-brain system responds to auditory stimulus by a mechanism we call auditory critical bands; that the ear's absolute threshold of hearing varies with frequency and covers roughly 120dB of dynamic range (for simple tones); that this dynamic range collapses drastically (to roughly 10-30dB) in the presence of a masking stimulus; and that many characteristics of a sound wave are involved in sound localization, particularly interaural level differences (ILDs) and interaural time differences (ITDs)
 
Spectral slits and parallel processing describe how a spectrum can be divided up and how the human ear responds to those divisions
 
Consider the following burst of composite tones:
 
20150612081919307.png 
 
Not much to go on, but amplitude is on the vertical (Y) axis and time on the horizontal (X) axis
 
The waveform describes a composite burst of tones in time
 
We know from the discussion of auditory critical bands that the human ear first 'spreads' this packet along the basilar membrane, in order to separate the composite signal into discrete tones
 
20150606105935397.png 
 
Additionally, we know that auditory critical bands overlap and each span approximately 1mm along the basilar membrane
 
 
20150606110324797.png 
 
Now we introduce the concept of spectral slits, which are simply narrow windows of the selected spectra along certain regions of the basilar membrane (we now are looking at groups of 1mm critical bands)
 
For instance, after spreading the above composite signal along the basilar membrane, we can arbitrarily arrange the critical bands into 4 groups, or slits, of the spectrum
 
20150612082911122.png 
 
These four slits (or groupings of critical bands) can be seen to be representative of the original spectrum, illustrating that different portions of the basilar membrane are responsive to different frequencies 
 
The human ear processes these spectral slits in parallel, ultimately arriving at the original composite impulses
 
The snippet above happens to be from a bit of speech, and an interesting psychoacoustic side effect of this 'spread then process' technique follows:
 
Four spectral channels, distributed over the speech-audio range (0.3 – 6 kHz) are sufficient for human listeners to decode material with nearly 90% accuracy although more than 70% of the spectrum is missing.  Word recognition often remains relatively high (75-83%) when just two or three channels are presented concurrently, despite the fact that the intelligibility of these same slits, presented in isolation, is less than 9%
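
To get a feel for how sparse four slits are, here is a minimal sketch that places four narrow bands across 0.3 – 6 kHz and measures how much of the range they cover on a log-frequency axis. The 1/3-octave slit width and the evenly log-spaced centers are assumptions for illustration only; the post does not give the actual slit widths or center frequencies, and the function names are made up for this example.

```python
import math

def slit_edges(lo_hz, hi_hz, n_slits, width_octaves):
    """Return (low, high) edges in Hz for n_slits narrow bands.

    Centers are spaced evenly on a log-frequency axis so that the
    outermost slits just touch lo_hz and hi_hz (an assumed layout).
    """
    half = width_octaves / 2.0
    first_center = lo_hz * 2.0 ** half
    last_center = hi_hz / 2.0 ** half
    step = math.log2(last_center / first_center) / (n_slits - 1)
    edges = []
    for i in range(n_slits):
        center = first_center * 2.0 ** (i * step)
        edges.append((center / 2.0 ** half, center * 2.0 ** half))
    return edges

def log_coverage(edges, lo_hz, hi_hz):
    """Fraction of the lo_hz..hi_hz range (in octaves) covered by the slits.

    Assumes the slits do not overlap one another.
    """
    total_octaves = math.log2(hi_hz / lo_hz)
    kept_octaves = sum(math.log2(hi / lo) for lo, hi in edges)
    return kept_octaves / total_octaves

# Hypothetical setup: four 1/3-octave slits over the speech-audio range.
slits = slit_edges(300.0, 6000.0, 4, 1.0 / 3.0)
for lo, hi in slits:
    print(f"slit: {lo:7.1f} Hz to {hi:7.1f} Hz")
print(f"fraction of log-frequency range kept: {log_coverage(slits, 300.0, 6000.0):.2f}")
```

Under these assumed numbers the slits keep only about 0.31 of the log-frequency range, so most of the band is absent, which is the same ballpark as the figure quoted above.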
 
The partitioning of the speech bands was performed in logarithmic units, because psychophysical tuning, as measured by the equivalent rectangular bandwidth (ERB_N), follows a simple logarithmic function in the mid- and high-frequency regions, and is approximately constant in width at 1/6 octave.
 
Let's explore this situation while introducing more than four slits
 
20150612084945568.png 
 
Prior to combination, the target stimuli were filtered into 30 contiguous frequency bands ranging from 80 to 7563 Hz 
The figure above shows that at least 20 of the 1/6-octave critical bands must be present to avoid appreciably degrading signal integrity.
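
As a rough illustration of how a range like 80 to 7563 Hz can be cut into 30 contiguous bands on an auditory scale, the sketch below spaces the band edges evenly in ERB-number units. It assumes the Glasberg and Moore (1990) ERB-number formula, which the post does not name, and the function names are hypothetical. Under that assumption the range spans very close to 30 ERB-number units, so each band comes out about one ERB wide.

```python
import math

def hz_to_erb_number(f_hz):
    """ERB-number (Cam) for a frequency in Hz, per Glasberg & Moore (1990).

    Using this particular formula is an assumption of this sketch.
    """
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)

def erb_number_to_hz(e):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def band_edges(lo_hz, hi_hz, n_bands):
    """Edges (Hz) of n_bands contiguous bands, equally wide in ERB-number."""
    e_lo = hz_to_erb_number(lo_hz)
    e_hi = hz_to_erb_number(hi_hz)
    step = (e_hi - e_lo) / n_bands
    return [erb_number_to_hz(e_lo + i * step) for i in range(n_bands + 1)]

edges = band_edges(80.0, 7563.0, 30)
span = hz_to_erb_number(7563.0) - hz_to_erb_number(80.0)
print(f"range spans {span:.2f} ERB-number units")
for i in range(len(edges) - 1):
    print(f"band {i + 1:2d}: {edges[i]:7.1f} Hz to {edges[i + 1]:7.1f} Hz")
```

Dropping bands from this list, for example keeping only every other one, is a simple way to experiment with how many bands can be removed.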
 
It's surprising how much information can be deleted while still maintaining extremely high intelligibility rates 
 
