Figure 6.18. The cochlea as a piano soundboard.
The cochlear nucleus 
The cochlear nucleus is not in the cochlea. It is the first auditory processing 
substation along the auditory nerve. In the cochlear nucleus, different cell types 
and networks process the auditory signal in at least four distinctive ways 
(Harrison and Howe 1974; Kelly 1985). 
First, large spherical neurons in the cochlear nucleus not only relay the hair cells’ signals but do so tonotopically. That is, they preserve the cochlea’s regular, keyboardlike mapping of different frequencies. Each spherical cell responds to only one narrow frequency range because each is innervated by only a few hair cells. Like narrowband spectrograms, these neurons transmit periodic, vocalic information to higher processing centers.
Figure 6.19. Central auditory pathways. (Manter 1975. Reprinted by permission of F. A. Davis Company.)

Figure 6.20. Octopus cells of the cochlear nucleus.

Second, octopus cells in the cochlear nucleus (figure 6.20) respond to signals from wide frequency ranges of hair cells. Because they sample a broad spectrum, octopus cells are well designed to function like sound meters, measuring the overall intensity of a complex sound.
Third, because octopus cells receive inputs from many hair cells at many 
frequencies at once, they are capable of quickly depolarizing in response to 
brief stimuli like the many frequencies in a plosive burst. By contrast, it takes 
a relatively long time for the few hair cells in a single narrow frequency band 
to depolarize a spherical cell. 
Fourth, many octopus cells’ dendrites are arrayed to receive inputs either from high to low or from low to high (figure 6.20). Thus, these cells are differentially sensitive to brief events such as the rising and falling formant transitions which mark place of articulation (figure 6.14; Kelly 1985).
From the cochlear nucleus afferent auditory axons enter the trapezoid body and cross over to the superior olivary nucleus (superior olive) on the contralateral side of the brain stem. However, unlike vision and touch, not all auditory pathways cross. An ipsilateral (same-side) pathway also arises (i.e., from left ear to left cerebral hemisphere and from right ear to right cerebral hemisphere). Thus, each side of the brain can compare inputs from both ears. Sounds coming from the left or right side reach the left and right ears at slightly different times, allowing the brain to identify the direction of sounds. Bernard Kripkee has pointed out to me how remarkable the ability to localize sound is. We learned in chapter 3 that brain cells fire at a maximum rate of about 400 spikes per second. Nevertheless, the human ear can readily distinguish sound sources separated by only a few degrees of arc. At the speed of sound, this translates into time differences on the order of 0.0004 s. This means the auditory system as a whole can respond about 100 times faster than any single neuron in it. Instead of the Von Neumannesque, all-or-nothing, digital response of a single neuron, many neurons working together in parallel produce a nearly analogue response with a hundredfold improvement in sensory resolution.
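To get a feel for the scale involved, here is a back-of-envelope sketch (my own illustration, not from the text), which estimates interaural time differences from a simple path-length model with an assumed ear separation of 0.2 m and a speed of sound of 343 m/s:

```python
import math

SPEED_OF_SOUND = 343.0    # m/s in air (assumed)
EAR_SEPARATION = 0.20     # m between the ears, a rough assumed figure

def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate ITD in seconds for a distant source at the given azimuth,
    using the simple path-length model (d / c) * sin(azimuth); diffraction
    around the head is ignored."""
    return (EAR_SEPARATION / SPEED_OF_SOUND) * math.sin(math.radians(azimuth_deg))

if __name__ == "__main__":
    for angle in (1, 3, 5, 45):
        itd_us = interaural_time_difference(angle) * 1e6
        print(f"{angle:2d} degrees -> ITD of about {itd_us:5.0f} microseconds")
    # A single neuron firing at ~400 spikes/s has an interspike interval of
    # 2.5 ms (2500 microseconds); the differences above are far smaller, so
    # localization must emerge from many neurons acting in parallel.
```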
Ultimately, pathways from both the cochlear nucleus and the superior olive combine to form the lateral lemniscus, an axon bundle which ascends to the inferior colliculus. The inferior colliculus (1) relays signals to higher brain centers (especially the medial geniculate nucleus of the thalamus, MGN) and (2) in doing so preserves the tonotopic organization of the cochlea. That is, the topographic arrangement by frequency which is found in the cochlea is repeated in the inferior colliculus. (And indeed is repeated all the way up to the cerebrum!) The inferior colliculus has been clearly implicated in sound localization, but little is known about the functions of the inferior colliculus with respect to speech. However, it is noteworthy that reciprocal connections exist both from the medial geniculate nucleus (MGN) and from the cerebral cortex back to the inferior colliculus, and these pathways will prove important to one of our first adaptive grammar models in chapter 7.
Like the inferior colliculus, the medial geniculate nucleus relays afferent auditory signals to auditory cortex, retaining tonotopic organization. It also receives reciprocal signals from auditory cortex and sends reciprocal signals back to the inferior colliculus. Unlike the inferior colliculus, the medial geniculate nucleus also exchanges information with thalamic centers for other senses, especially vision. As a result, some cross-modal information ascends from the thalamus to the cortex. Moreover, much of this information seems to be processed in cerebrum-like on-center off-surround anatomies.
Medial geniculate nucleus projections of the auditory pathway erupt into the cerebrum in the (auditory) koniocortex on the inner surface of the superior temporal gyrus, inside the Sylvian fissure. Perhaps because this area is relatively inaccessible to preoperative brain probes, it has been relatively little studied in the human case. Nevertheless, it can be inferred from dissection, as well as from many studies of mammals and primates, that koniocortex is characterized by tonotopic neuron arrays (tonotopic maps) which still reflect the tonotopic organization first created by the cochlea. Considerable research on tonotopic mapping has been done on mammalian brains ranging from bats (Suga 1990) to monkeys (Rauschecker et al. 1995). In fact, these brains tend to exhibit three, four, five, or more cerebral tonotopic maps.
Since the same on-center off-surround architecture characterizes both visual and auditory cortex, the same minimal visual anatomies from which adaptive resonance theory was derived can be used to explain sound perception. We will first consider how auditory contrast enhancement may be said to occur, and then we will consider auditory noise suppression.
Auditory contrast enhancement 
At night, a dripping faucet starts as a nearly inaudible sound and little by little builds until it sounds like Niagara Falls, thundering out all possibility of sleep. Figure 6.21 models this common phenomenon. Field F¹ (remember that fields are superscripted while formants are subscripted) models a tonotopic, cochlear nucleus array in which the drip is perceived as a single, quiet note activating cell x subliminally (“below the limen,” the threshold of perception). For concreteness, the higher field F² may be associated with the inferior colliculus or the medial geniculate nucleus. The response at F¹ also graphs the (subliminal) response at t₁, while the response at F² also graphs the (supraliminal) response at some later t₂. At t₁ the response at cell x, stimulated only by the faucet drip, barely rises above the surrounding stillness of the night and does not cross the threshold of audibility. However, cell x is stimulated both by the drip and by resonant stimulation. At the same time, the surrounding cells, . . . , x − 2, x − 1 and x + 1, x + 2, . . . , become inhibited by cell x, and as they become inhibited, they also disinhibit cell x, adding further to the on-center excitation of x. This process (“the rich get richer and the poor get poorer”) continues until finally, at tₙ, cell x stands out as loudly as any sound can. The contrast between the drip and the nighttime silence has become enhanced.

Figure 6.21. Auditory contrast enhancement.
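The dynamics just described can be sketched in a few lines of code. The following is a minimal, hypothetical simulation (my own sketch, not the author’s equations) of a recurrent on-center off-surround field: each cell excites itself and inhibits its neighbors, and the one slightly stronger cell grows until it dominates the field.

```python
import numpy as np

def contrast_enhance(field, steps=60, excite=1.2, inhibit=0.2):
    """Iterate a recurrent on-center off-surround field: each cell excites
    itself (on-center) and is inhibited by the total signal of the other
    cells (off-surround).  A squared signal function makes strong cells grow
    faster than weak ones."""
    x = np.asarray(field, dtype=float)
    for _ in range(steps):
        signal = x ** 2                        # faster-than-linear firing signal
        surround = signal.sum() - signal       # summed activity of all other cells
        x = np.clip(x + excite * signal - inhibit * surround, 0.0, 1.0)
    return x

if __name__ == "__main__":
    field = np.full(11, 0.05)      # the quiet of the night across a tonotopic array
    field[5] = 0.07                # the faucet drip: one slightly stronger cell x
    print(np.round(contrast_enhance(field), 3))
    # Cell 5 climbs to the ceiling while its neighbors are driven to zero:
    # "the rich get richer and the poor get poorer."
```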
Auditory noise suppression and edge detection 
Auditory noise suppression is closely related to contrast enhancement since 
both are caused by the dynamics of on-center off-surround neural anatomies. 
Noise suppression and an interesting, related case of spurious edge detection 
are illustrated in figure 6.22. 
Figure 6.22. White noise is suppressed; band-limited noise is not.

In World War II communications research, it was found that white noise interfered with spoken communication less than band-limited noise. Figure 6.22 presents a vowel spectrum (solid line) under conditions of both white and band-limited noise (dotted lines). At the same amplitude, when noise is limited to a relatively narrow frequency band, there is “less” of it than there is of white noise, which covers the entire frequency range. Nevertheless, the band-limited noise interferes more with speech communication. This occurs because, under white noise, the on-center off-surround neural filter causes the noise to be uniformly suppressed across the entire spectrum. At the same time, contrast enhancement picks out the formant peaks and emphasizes them above the background noise. Thus, in figure 6.22, under the white-noise condition, the perceived formant spectrum is preserved at time c, after neural processing. The band-limited noise, however, introduces perceptual edges which, like the edges in figure 5.4b, become enhanced. By time c (figure 6.22), the speech spectrum has been grossly distorted: the perceptually critical second formant peak has been completely suppressed and replaced by two spurious formant peaks introduced at the edges of the band-limited noise. This further illustrates the perceptual phenomenon of edge detection, which occurs because in on-center off-surround processing, the middle of the band-limited noise is laterally inhibited and suppressed from both sides, while the edges are suppressed only from one side.
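As a rough illustration (my own toy example with made-up numbers, not the author’s model), the same kind of lateral-inhibition filter can be applied to a schematic spectrum: a uniform white-noise floor is suppressed evenly and the formant peaks survive, while a band-limited noise block is inhibited in its middle, burying the formant inside it, and leaves spurious peaks standing at its two edges.

```python
import numpy as np

def on_center_off_surround(spectrum, surround_width=3, inhibit=0.15):
    """Feedforward lateral inhibition: each channel keeps its own input but is
    inhibited by the summed activity of its neighbors on both sides."""
    spectrum = np.asarray(spectrum, dtype=float)
    out = np.empty_like(spectrum)
    for i in range(len(spectrum)):
        left = spectrum[max(0, i - surround_width):i]
        right = spectrum[i + 1:i + 1 + surround_width]
        out[i] = spectrum[i] - inhibit * (left.sum() + right.sum())
    return np.clip(out, 0.0, None)

if __name__ == "__main__":
    vowel = np.zeros(40)
    vowel[[6, 14, 27]] = 1.0      # three schematic formant peaks (F1, F2, F3)
    white = vowel + 0.3           # white noise: a uniform floor everywhere
    banded = vowel.copy()
    banded[10:22] = 1.0           # band-limited noise as loud as the formants, burying F2 (channel 14)
    print("white  :", np.round(on_center_off_surround(white), 2))
    print("banded :", np.round(on_center_off_surround(banded), 2))
    # White noise: the uniform floor is suppressed nearly everywhere and all
    # three formant peaks survive.  Band-limited noise: the interior of the
    # band, including the F2 channel, is inhibited from both sides, while the
    # two band edges are left standing as spurious peaks.
```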
In this chapter, we have looked at the basic human motor-speech and auditory systems, and we have seen how these systems produce, sample, sense, and process sound and perceive it as the most fundamental elements of speech. In particular, we examined basic examples of noise suppression, contrast enhancement, and edge detection in on-center off-surround ART networks. In chapter 7 we will extend these findings from these atomic, phonetic speech sounds and phenomena to the phonemic categories of speech.

Speech Perception
In chapter 6 we described the nature of the speech signal and how its image is sensed and presented to the cerebrum. We noted that because no two vocal tracts are exactly alike, your pronunciation will differ subtly but certainly from my pronunciation. To express these subtle, phonetic differences, linguists invented the International Phonetic Alphabet (IPA). In the fifteenth century, the Great English Vowel Shift caused the writing system of English to deviate from continental European systems, so IPA looks more like French or Italian spelling than English. Thus, when you say beet, we might write it in IPA as [bit], and if I pronounce my [i] a little further forward in my mouth, we could capture this detail of my pronunciation in IPA as [bi⁺t]. The sounds of the letters of IPA correspond to phones, and the brackets tell us we are attempting to capture pronunciation phonetically, that is, as accurately as possible.
But even though you say [bit] and I say [bi⁺t], we both perceive the word beet. This is a small miracle, but it is very significant. In one sense, language itself is nothing but a million such small miracles strung end to end. No two oak trees are exactly alike either, but 99% of the time you and I would agree on what is an oak and what isn’t an oak. This ability to suppress irrelevant detail and place the objects and events of life into the categories we call words is near to the essence of cognition. In linguistics, categorically perceived sounds are called phonemes, and phonemic categories are distinguished from phonetic instances by enclosing phonemes in slashes. Thus, for example, you say [bit] and I say [bi⁺t], but we both perceive /bit/.
Whereas in chapter 6 we found much to marvel at in the neural production of phones, we turn now to the still more marvelous phenomena of categorical neural perception of phonemes. To understand speech perception, and ultimately cognition, we now begin to study how minimal cerebral anatomies process the speech signal. The result of this exercise will be the first collection of hypotheses that define adaptive grammar.

Voice Onset Time Perception 
The words beet and heat differ by only their initial phonemes: /bit/ versus /hit/. Linguists call such pairs of words minimal pairs. More minimally still, beet and peat only differ on a single feature. The phonemes /b/ and /p/ are both bilabial plosive (or stop) consonants. The only difference is that /b/ is a voiced consonant, and /p/ is an unvoiced consonant. The words /bit/ and /pit/ differ only on the manner feature of voicing. Such minimal pairs isolate the categorical building blocks of language and provide an excellent laboratory in which to begin the study of cognition.

The difference between voiced and unvoiced sounds was long said to be that the vocal cords vibrated during production of voiced sounds but not during the production of unvoiced sounds. This is true enough, but as we have seen, the production of speech sounds is only one-half of the language equation. Following the invention of the sound spectrograph in the late 1940s it became possible for researchers to study the other half, the perception of speech sounds. In a landmark study, Liberman et al. (1952) used spectrography to measure the plosive voicing contrast against how it is perceived by listeners. For example, spectrograms of /p/ and /b/ in paid and bade are given in figure 7.1. The spectrograms for both /p/ and /b/ begin with a dark, vertical band which marks the initial, plosive burst of these consonants. (We will examine the third spectrogram in figure 7.1 a little later in this chapter.) These are followed by the dark, horizontal bands of the formants of /e/. Finally, each spectrogram ends with another burst marking the final /d/. It is difficult to find much to say about nothingness, so one might believe, as linguists long did, that the most significant difference between /p/ and /b/ is the aspiration following /p/. This is the high-frequency sound in figure 7.1, appearing after the burst in paid. From a listener’s perspective, however, such aspiration falls outside the tuning curve of the ear canal and is too faint to be reliably heard. It might as well be silence. And indeed, in 1957 Liberman et al. found that it was the silence following a plosive burst which distinguished /p/ and /b/. They called this silent interval voice onset time (VOT).
Figure 7.1. Spectrograms of [bed], [ped], and [ᵐbed] (Spanish-like prevoicing).

Marked VOT in figure 7.1, voice onset time is usually measured from the 
burst to the beginning of the lowest dark band on the spectrogram, the voicing 
bar. Once researchers recognized that silence could be a highly contrastive 
speech cue, it was easy to see from spectrograms how VOT could be a highly 
salient feature of speech. Using synthesized speech, numerous studies quickly 
verified that VOT was the primary feature distinguishing voiced and unvoiced 
consonants and that this distinction applied universally across languages (Lisker 
and Abramson 1964). 
It was soon discovered that these perceptual distinctions were also categorical. In 1957, Liberman et al. presented listeners with a series of syllables which varied in VOT between 0 and 50 ms (in IPA we might represent the stimuli as [ba], [b⁺a], [b⁺⁺a], etc.). They asked listeners to identify these syllables as either /ba/ or /pa/. As figure 7.2 shows, they found an abrupt, categorical shift toward the identification of /pa/ when VOT reached 25 ms. Initial plosives with VOTs under 25 ms were perceived as voiced, while those with longer VOTs were perceived as unvoiced. It was as if a binary switch flipped when VOT crossed the 25 ms boundary.
This metaphor of a binary switch was particularly attractive to generative philosophers, who viewed language as the product of a computational mind, and the metaphor took on the further appearance of reality when Eimas et al. (1971) demonstrated that even extraordinarily young infants perceived the voiced-voiceless distinction categorically. In a series of ingenious studies, Eimas and his coworkers repeated synthetic VOT stimuli to infants as young as one month.

In these experiments, the infants were set to sucking on an electronically monitored pacifier (figure 7.3). At first, the synthetic speech sound [ba]₀ (i.e., VOT = 0) would startle the neonates, and they would begin sucking at an elevated rate. The [ba]₀ was then repeated, synchronized with the infant’s sucking rate, until a stable, baseline rate was reached: the babies became habituated to (or bored with) [ba]₀.

Figure 7.2. Categorical perception.

Figure 7.3. Eimas et al.’s (1971) “conjugate sucking” paradigm.
Then the stimulus was changed. If it was changed to [ba]₃₀, the infants were startled again. They perceived something new and different, and their sucking rate increased. If, however, the new stimulus was [ba]₁₀ or [ba]₂₀ and did not cross the 25 ms VOT boundary, the babies remained bored. They perceived nothing new, and they continued to suck at their baseline rate.
This study was replicated many times, and the conclusion seemed inescapable: Chomsky’s conjecture on the innateness of language had been experimentally proved. Neonates had the innate capacity to distinguish so subtle and language-specific a feature as phonemic voicing! But then the study was replicated once too often. In 1975, Kuhl and Miller replicated the Eimas study—but with chinchillas! Obviously, categorical perception of VOT by neonates was not evidence of an innate, distinctively human, linguistic endowment.
Figure 7.4 explains both infants’ and chinchillas’ categorical perception of the voiced-voiceless contrast as the result of species-nonspecific dipole competition. In figure 7.4a, the left pole responds to an aperiodic plosive burst at t = 0 ms. Despite the brevity of the burst, feedback from u₂ to u₁ causes site u₁ to become persistently activated. This persistent activation also begins lateral inhibition of v₁ via iᵤᵥ. When the right pole is later activated at v₀ by the periodic inputs of the vowel (voice onset at t > 25 ms), inhibition has already been established. Because v₁ cannot fire, v₂ cannot fire. Only the unvoiced percept from u₂ occurs at F².

In figure 7.4b, on the other hand, voice onset occurs at t < 25 ms. In this case, v₁ reaches threshold, fires, and establishes feedback to itself via v₂ before iᵤᵥ can inhibit v₁. Now, driven by both v₂–v₁ feedback and v₀–v₁ feedforward inputs, iᵥᵤ can inhibit u₁, and a voiced percept results at v₂.
Figure 7.4. A VOT dipole. (a) Unvoiced percept. (b) Voiced percept.

Figure 7.4 also explains more subtle aspects of English voicing. For example, the [t] in step is perceived as an unvoiced consonant, but acoustically, this [t] is more like a /d/: it is never aspirated, and its following VOT is usually less than 25 ms. How then is it perceived as a /t/? In this case, figure 7.4 suggests that the preceding /s/ segment excites the unvoiced pole of figure 7.4, so it can establish persistent inhibition of the voiced pole without 25 ms of silence. It predicts that if the preceding /s/ segment is synthetically shortened to less than 25 ms, the [t] segment will then be heard as a voiced /d/.
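The gating logic of figure 7.4 can be made concrete with a small caricature. The sketch below is my own illustration under assumed numbers (an interneuron time constant of 36 ms and a half-strength threshold), not the author’s equations: the burst turns the unvoiced pole on at t = 0, its inhibitory interneuron iᵤᵥ charges up gradually, and the percept depends on whether the vowel arrives before or after that inhibition takes hold.

```python
import math

# Assumed, illustrative constants; the book gives no equations for figure 7.4.
INTERNEURON_TAU_MS = 36.0     # time constant of the slow inhibitory interneuron i_uv
INHIBITION_THRESHOLD = 0.5    # fraction of full strength needed to block the other pole

def inhibition_onset_ms(tau: float = INTERNEURON_TAU_MS,
                        threshold: float = INHIBITION_THRESHOLD) -> float:
    """Time for an interneuron charged by a persistently active pole to reach
    threshold: solve 1 - exp(-t / tau) = threshold for t."""
    return -tau * math.log(1.0 - threshold)

def vot_percept(vot_ms: float) -> str:
    """Race between the two poles.  The burst turns u on at t = 0 and u2-u1
    feedback keeps it on; i_uv becomes effective after inhibition_onset_ms().
    If the vowel reaches v0 before then, v1 fires, locks in via v2-v1 feedback,
    and i_vu silences u; otherwise v1 is already blocked."""
    return "voiced" if vot_ms < inhibition_onset_ms() else "unvoiced"

if __name__ == "__main__":
    print(f"u's inhibition of v takes effect after ~{inhibition_onset_ms():.1f} ms")
    for vot in (0, 10, 20, 30, 40, 50):
        syllable = "/ba/" if vot_percept(vot) == "voiced" else "/pa/"
        print(f"VOT = {vot:2d} ms -> perceived as {syllable}")
```

With these assumed values the boundary falls just under 25 ms; the step example would correspond to the preceding /s/ partially pre-charging iᵤᵥ, shifting the boundary toward 0 ms.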
The differentiation of wideband and narrowband perception has a plausible macroanatomy and a plausible evolutionary explanation. Figure 7.4 models a cerebral dipole, but dipoles also exist ubiquitously in the thalamus and other subcerebral structures—wherever inhibitory interneurons occur. In this case, the unvoiced pole must respond to a brief burst stimulus with a broadband spectrum. It is therefore plausible to associate u₀ with the octopus cells of the cochlear nucleus since, as we saw in chapter 6, this is exactly the type of signal to which they respond. Similarly, we associate v₀ with the tonotopic spherical cells of the cochlear nucleus. We associate F¹ of figure 7.4 with the inferior colliculus and medial geniculate nucleus. It is known that the octopus cells and spherical cells send separate pathways to these subcortical structures. Lateral competition between these pathways at F¹ is more speculative. The inferior colliculus has been mostly studied as a site computing interaural timing and sound localization (Hattori and Suga 1997). However, both excitatory and inhibitory cell types are present, and it is probable that such lateral inhibition does occur at the inferior colliculus (Pierson and Snyder-Keller 1994). Lateral competition is a well-established process at the levels of the medial geniculate nucleus, the thalamic reticular formation, and cerebral cortex (grouped as F² in figure 7.4; Suga et al. 1997). For simplicity, however, we diagram lateral competition only at F¹ in figure 7.4. Likewise, reciprocal feedforward-feedback loops like u₁–u₂–u₁ and v₁–v₂–v₁ are found at all levels of auditory pathways, but figure 7.4 emphasizes feedback loops from u₂ to u₁, following Suga and his colleagues (Ohlemiller et al. 1996; Yan and Suga 1996; Zhang et al. 1997), who have identified what is presumably homologous “FM” feedback circuitry from cerebrum to inferior colliculus in bats. Finally, at typical central nervous system (CNS) signal velocities of 1 mm/ms, note that the circuits of figure 7.4 are also reasonably scaled for categorical perception centered around a VOT of 25 ms.
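A back-of-envelope check of that scaling claim, with an assumed loop length (my numbers, not the book’s):

```python
# Rough check of the scaling claim; the loop length is an assumed, illustrative figure.
CONDUCTION_MM_PER_MS = 1.0     # the CNS signal velocity cited in the text
LOOP_PATH_MM = 25.0            # assumed round-trip axon length of a subcortical feedback loop

loop_delay_ms = LOOP_PATH_MM / CONDUCTION_MM_PER_MS
print(f"A {LOOP_PATH_MM:.0f} mm feedback loop at {CONDUCTION_MM_PER_MS} mm/ms imposes "
      f"~{loop_delay_ms:.0f} ms of delay, on the order of the 25 ms VOT boundary.")
```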
Some languages, like Thai and Bengali, map prevoiced (VOT ≤ −25 ms), voiced (VOT ≈ 0 ms), and unvoiced plosive phones (VOT > 25 ms) to three different phonemic categories, and replications of the Eimas study using this wider range of stimuli suggest that neonates can also perceive VOT in three categories. Nevertheless, most languages, including the European languages, divide the VOT continuum into only two phonemic categories: voiced and unvoiced. The problem is that these languages divide the continuum in different places, so before the sound spectrograph came along, this situation confused even trained linguists. For example, whereas English and Chinese locate the voicing crossover at 25 ms, so that “voiced” /b/ < 25 ms < “unvoiced” /p/, Spanish locates its voicing crossover at 0 ms, so that “voiced” /b/ < 0 ms < “unvoiced” /p/. That is, Spanish /b/ is prevoiced, as in figure 7.1.
As one result, when Spanish and Portuguese missionaries first described the Chinese language, they said it lacked voiced consonants, but if these same missionaries had gone to England instead, they might well have said that English lacked voiced consonants. Subsequently, this Hispanic description of Chinese became adopted even by English linguists. In the Wade-Giles system for writing Chinese in Roman letters, /di/ was written ti and /ti/ was written t’i, and generations of English learners of Chinese have learned to mispronounce Chinese accordingly, even though standard English orthography and pronunciation would have captured the Chinese voiced/voiceless distinction perfectly.
Because of its species nonspecificity, subcerebral origins, and simple dipole mechanics, categorical VOT perception probably developed quite early in vertebrate phylogeny. It is quite easy to imagine that the ability to discriminate between a narrowband, periodic birdsong and the wideband, aperiodic snapping noise of a predator stepping on a twig had survival value even before the evolution of mammalian life. Still, it is not perfectly clear how figure 7.4 applies to the perception of Spanish or Bengali. As surely as there are octopus cells in the cochlear nucleus, the XOR information of the voicing dipole is present, but how it is used remains an issue for further research.
Phoneme Learning by Vowel Polypoles 
A few speech features such as VOT may be determined by subcortical processing, but most speech and language features must be processed as higher cognitive functions. Most of these features begin to be processed in primary auditory cortex, where projections from the medial geniculate nucleus erupt into the temporal lobe of the cerebrum. Although a few of these projections might be random or diffuse signal pathways, a large and significant number are coherently organized into tonotopic maps. That is, these projections are spatially organized so as to preserve the frequency ordering of sound sensation that was first encoded at the cochlea.
PET scans and MRI scans have so far lacked sufficient detail to study human tonotopic maps, so direct evidence of tonotopic organization in humans is sparse. Animal studies of bats and primates, however, have revealed that the typical mammalian brain contains, not one, but many tonotopic maps. The bat, for example, exhibits as many as five or six such maps (Suga 1990).

It is rather impressive that this tonotopic order is maintained all the way from the cochlea to the cerebrum, for although this distance is only a few centimeters, some half-dozen midbrain synapses may be involved along some half-dozen distinct pathways. Moreover, no fewer than three of these pathways cross hemispheres, yet all the signals reach the medial geniculate nucleus more or less in synchrony and project from there into the primary auditory cortex, still maintaining tonotopic organization. In humans, tonotopic organization implies that the formant patterns of vowels, which are produced in the vocal tract and recorded at the cochlea, are faithfully reproduced in the cerebrum. The general structure of these formant patterns was presented in chapter 5. What the cerebrum does with tonotopic formant patterns is our next topic.
To model how phones like [i] become phonemes like /i/, we return to the on-center off-surround anatomy, which we now call a polypole for simplicity. For concreteness, imagine an infant learning Spanish (which has a simpler vowel system than English), and consider how the formant pattern of an [i] is projected from the cochlea onto the polypoles of primary auditory cortex (A₁) and tertiary auditory cortex (A₃) in figure 7.5. (We will discuss A₂ at the end of this chapter.) If we take the infant’s primary auditory cortex to be a tabula rasa at birth, then its vector of cortical long-term memory traces, drawn in figure 7.5 as modifiable synaptic knobs at A₃, is essentially uniform. That is, z₁ = z₂ = . . . = zₙ.

When the vowel [i] is sensed at the cochlea and presented to polypoles A₁ and A₃, the formants of the [i] map themselves onto the long-term memory traces zᵢ between A₁ and A₃.
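As a concrete illustration, here is a minimal sketch (my own, with an assumed learning rule, channel indices, and jitter, not the book’s equations) in which the A₁-to-A₃ memory traces start out uniform and are nudged toward each contrast-enhanced [i] the infant hears:

```python
import numpy as np

rng = np.random.default_rng(0)

N_CHANNELS = 20
IDEAL_I = np.zeros(N_CHANNELS)
IDEAL_I[[3, 14]] = 1.0        # schematic [i]: a low F1 and a high F2 channel (indices assumed)

def hear_an_i():
    """One speaker's [i]: the ideal formant pattern plus speaker-to-speaker jitter."""
    return np.clip(IDEAL_I + rng.normal(0.0, 0.08, N_CHANNELS), 0.0, None)

def learn(presentations=200, rate=0.05):
    """Long-term memory traces z start uniform (the tabula rasa) and are nudged
    toward each contrast-enhanced input pattern."""
    z = np.full(N_CHANNELS, 0.5)                    # z1 = z2 = ... = zn at birth
    for _ in range(presentations):
        x = hear_an_i()
        x = np.where(x >= 0.5 * x.max(), x, 0.0)    # crude contrast enhancement
        z += rate * (x - z)                         # move the traces toward the pattern
    return z

if __name__ == "__main__":
    print(np.round(learn(), 2))
    # The traces are no longer uniform: they peak at the two [i] formant
    # channels and have decayed toward zero elsewhere.
```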

Feature Filling and Phonemic Normalization 
In figure 7.5, lateral inhibition across A₁ and A₃ contrast-enhances the phoneme /i/. Thus, the formant peaks at A₃ become more exaggerated and better defined than the original input pattern. This has the benefit of allowing learned, expectancy feedback signals from the idealized phoneme pattern across A₃ to deform the various [i]s of different speakers to a common (and, thus, phonemic) pattern for categorical matching and recognition.
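A rough sketch of the normalization idea (again my own illustration with assumed numbers, not the author’s model): once an idealized /i/ pattern has been learned, top-down expectancy feedback can be blended with a new speaker’s slightly different [i], and channels the expectation does not support can be suppressed, deforming both renditions toward the same stored pattern.

```python
import numpy as np

PROTOTYPE_I = np.zeros(20)
PROTOTYPE_I[[3, 14]] = 1.0        # the learned /i/ expectation across A3 (channels assumed)

def normalize(phone, expectation=PROTOTYPE_I, top_down=0.6):
    """Blend bottom-up input with top-down expectancy, then suppress channels
    the expectation does not support (a loose analogue of expectancy matching)."""
    blended = (1.0 - top_down) * phone + top_down * expectation
    matched = np.where(expectation > 0, blended, 0.0)    # keep only expected features
    return matched / matched.max()                       # renormalize to the strongest peak

if __name__ == "__main__":
    your_i = np.zeros(20); your_i[[3, 14]] = (1.0, 0.9)   # your [i]
    my_i = np.zeros(20);   my_i[[4, 14]] = (0.9, 1.0)     # my fronted [i]: one formant channel shifted
    print(np.round(normalize(your_i), 2))
    print(np.round(normalize(my_i), 2))
    # Both outputs peak on the same channels as the stored /i/ expectation, so
    # two phonetically different [i]s match the same phonemic category.
```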
