Figure 5.4.
(a–e) A cerebral gyrus simulated by equations 5.2–5.4. (f) A gyrus of macaque striate cortex radiographically labeled by tritium. (LeVay et al. 1985. Reprinted by permission of the Society for Neuroscience.)
Edge detection
In figure 5.4b, inputs continue to be applied through t = 1, and a resonant
pattern begins to develop. Edge detection emerges as another inherent property
of the ART system: the non-edge nodes at [10 11] and [10 13] are actively inhibited on both flanks (by each other, as well as by the edge nodes [10 9] and
[10 15]). On the other hand, the edge nodes are each actively inhibited only
by nodes on one interior flank (by [10 11] or [10 13]). Consequently, the edges
are less inhibited, more active, and more perceptually detectable.
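A minimal one-dimensional sketch in Python may make this concrete. It is an illustration with arbitrary constants, not an implementation of equations 5.2–5.4: each node is inhibited subtractively by its immediate neighbors, so a bar’s interior nodes lose activity to both flanks while its edge nodes lose activity to only one.

```python
# Minimal 1-D sketch of edge enhancement by lateral (off-surround) inhibition.
# Illustration only; this is not the book's equations 5.2-5.4.
inputs = [0, 0, 1, 1, 1, 1, 1, 0, 0]   # a "bar" of excited nodes

def settle(x, inhibition=0.4):
    """One subtractive inhibition step: each node is suppressed by its neighbors."""
    out = []
    for i, v in enumerate(x):
        left = x[i - 1] if i > 0 else 0
        right = x[i + 1] if i < len(x) - 1 else 0
        out.append(max(0.0, v - inhibition * (left + right)))
    return out

activity = settle(inputs)
print(activity)
# Interior bar nodes are inhibited on both flanks (1 - 0.4*2 = 0.2),
# while the bar's edge nodes lose only one flank (1 - 0.4*1 = 0.6),
# so the edges end up more active and more perceptually detectable.
```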
Normalization
A fourth property of on-center off-surround anatomies is normalization. Consider as an example the intensity of speech sounds. For human hearing, the
threshold of pain is somewhere around 130 dB. What happens at a rock concert, where every sound may be between 120 dB and 130 dB? In the on-center
off-surround anatomy of auditory cortex, if one is not first rendered deaf, a
frequency component at 130 dB will inhibit its neighbors. A neighboring frequency component at 120 dB can thus be inhibited and perceived as if it were
only 80 or 90 dB. In this way, an on-center off-surround anatomy accomplishes
a kind of automatic gain control—normalization that prevents the system from
saturating at high input levels.
On-center off-surround anatomies accomplish a similar kind of normalization in vision. At sunset, daylight is redder than it is at noon. Nevertheless,
at sunset we still perceive white as white, not as pink. In this case, an on-center
off-surround perceptual anatomy keeps the world looking normal even as its
optical characteristics physically change.
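The following sketch illustrates this gain control with the familiar equilibrium of a feedforward shunting on-center off-surround network, x_i = B·I_i/(A + ΣI_k); the constants and the factor of 1000 (about 30 dB) are illustrative, not fitted to auditory cortex.

```python
# Sketch of shunting on-center off-surround normalization (automatic gain control).
# x_i = B * I_i / (A + sum(I)) is the standard shunting equilibrium; the
# dB-to-intensity conversion here is only for illustration.
A, B = 1.0, 1.0          # decay rate and activity ceiling (arbitrary units)

def equilibrium(I):
    total = sum(I)
    return [B * Ii / (A + total) for Ii in I]

quiet = [1.0, 2.0, 1.0]              # a soft pattern
loud  = [x * 1000.0 for x in quiet]  # the same pattern, about 30 dB more intense

print(equilibrium(quiet))   # pattern ratios 1:2:1, modest activities
print(equilibrium(loud))    # same 1:2:1 ratios, total activity still below B
# The relative pattern is preserved while total activity saturates gracefully,
# so a 130 dB component cannot drive the field far beyond its ceiling.
```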
Rebounds
Figure 5.4c displays NSA applied at t = 2 to all nodes in the gyrus. This is analogous to the “flashbulb effect” discussed with reference to figure 5.3. After NSA,
at t = 3 (figure 5.4d), a rebound occurs: the target cells, which previously were
“on” relative to their surround, are now “off.” Note that the rebound does not
only occur on the local scale of the original target barrels. A larger scale, self-
similar rebound can also occur over the left and right hemifields of the gyrus
(fields [ x 1–20] vs. [ x 21–48]). Rebounds occur on large, multicellular scales
just as they occur on smaller, cellular scales. This capacity for fieldwide rebounds
is critical to preserving long-term memory invariance.
Complementation and long-term memory invariance
Long-term memory is, by definition, invariant: A pattern once learned (like
that in figure 5.4a) should not be easily forgotten. But amid all the busy, buzzing, resonant neural activity suggested by figure 5.4, what is to prevent a remembered pattern like that of figure 5.4a from being overwritten by other,
conflicting inputs? Rebounds are the answer to this cognitive problem, too.
Suppose that I ask you to learn the pattern in figure 5.4a, but after you have
studied it for a little while, I change my mind and say, “No, stop. Now learn
this second pattern.” In later chapters we will see how my No instruction causes
NSA and a general rebound, putting your mind in the tabula-not-quite-rasa state
of figure 5.4d. If, while you are in this new state, I present you with a new pattern
to learn, the original target nodes will be inactive. By equation 5.2, synaptic learning at those nodes will also be inactivated. (Let the rebounded, inhibited target nodes, x_r, in figure 5.4e have short-term memory (STM) activation levels of 0, i.e., x_r = 0. Then by equation 5.2, Ex_i x_r = 0, so all z_ir remain unchanged.) In this way,
rebounds complement memory, partitioning it and preventing new information
from overwriting old, long-term memories.
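A tiny numerical sketch of this gating follows. It assumes only what the text just stated, namely that the weight change prescribed by equation 5.2 contains the product Ex_i x_r, with E a learning rate; the full form of equation 5.2 is not reproduced here.

```python
# Hedged sketch of learning gated by postsynaptic activity. Only the product
# E * x_i * x_r from the text is assumed; constants are illustrative.
E = 0.1                       # learning rate
x_i = 0.8                     # presynaptic (sending) node: still active
x_r = 0.0                     # rebounded target node: STM activity driven to 0

dz_ir = E * x_i * x_r         # weight change at the rebounded synapse
print(dz_ir)                  # 0.0 -- no learning, so the old z_ir trace is preserved
```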
Ocular dominance columns
If the resonance begun in figure 5.4a–b is allowed to resonate without conflicting inputs, the gyrus eventually achieves the pattern of figure 5.4e. Figure 5.4e bears a striking resemblance to ocular dominance columns in primary visual (striate) cortex (figure 5.4f).
Wiesel and Hubel (1965; Wiesel et al. 1974) sutured one eye of newborn
kittens closed. After several months, this caused neocortical cells that were wired
to the sutured eye to become permanently unresponsive. Similar suturing of
adult cats had no such effect, so Wiesel and Hubel proposed that there existed
a critical period during which vision had to become established by experience.
Subsequent radiographic staining techniques allowed Wiesel, Hubel, and Lam
(1974) and others to make dramatic pictures showing that eye-specific striate
cortex cells were arranged in stripes, or “columns.” Figure 5.4f is one such picture, in which white stripes are neurons responding to one eye (not sutured
but radiographically labeled by tritium; LeVay et al. 1985). Like other sensory
systems, the visual system had been known to exhibit a retinotopic mapping
from peripheral receptors in the eye up through cortex. Ocular dominance
columns appeared as one more remarkable instance of such topographical
mapping. (The tonotopic mapping of the auditory system will figure prominently
in our next several chapters on speech and speech perception.)
Hubel and Wiesel’s studies exerted a broad influence in science, and Lenneberg (1967) proposed that a similar “critical-period” explanation could be extended to language. Coupled with Chomsky’s speculations on the innateness of
language, Lenneberg’s thesis suggested that there existed a detailed genetic plan
for grammar, just as there could be supposed to exist a detailed genetic plan in
which segregation of thalamocortical axons formed the anatomic basis for detailed
ocular dominance columns. Because only about 10^5 human genes are available to code the 10^8-odd axons innervating striate cortex (not to mention Broca’s area,
Wernicke’s area, and all the rest of the human brain), this suggestion was never
adequate, but in the absence of better explanations it was widely accepted.
We will return to these issues of critical periods and neuronal development
in later chapters. For now, it remains only to establish that the similarities between our model gyrus and real neocortex are not fortuitous. To this end, note that the similarities between figure 5.4e and figure 5.4f are not only impressionistic and qualitative but also quantitative: the diameter of the stripes in figure 5.4e is approximately two barrels—the radial extent of a barrel’s inhibitory surround in the model gyrus. It also happens that the width of Wiesel and
Hubel’s ocular dominance columns is 0.4 mm—approximately the radial extent of cerebral inhibitory cells’ inhibitory surround.⁶
XOR
The potential advantages of massively parallel computers have been vaguely
apparent for quite some time. In fact, the first large-scale parallel computer, the 64-processor Illiac IV, was designed in the mid-1960s. But before
the invention of microchips, parallel machines were prohibitively expensive,
and once built, the Illiac IV proved extremely difficult to program. The leading parallel-computing idea of the day was Rosenblatt’s perceptron model (Rosenblatt 1958, 1959, 1961), but in 1969 Minsky and Papert’s widely influential book Perceptrons (1969; see also Minsky and Papert 1967) claimed to prove that perceptrons were incapable of computing an XOR (see below). As a result, they argued, perceptrons were incapable of calculating parity, and therefore incapable of performing useful computational tasks.⁷ Because of this argument, XOR has
figured prominently in subsequent parallel-computing theory. We will return
to this issue, but for the present, consider only that dipoles calculate XOR.
XOR, or “exclusive OR,” means “A or B but not both.” Formally, it is a
Boolean logical operation defined for values of 1 and 0 (or true and false).
For the four possible pairs of 1 and 0, XOR results in the values given in table
5.2. XOR is 1 (true) if A or B—but not both—is 1. This is the same function
that gated dipoles computed in figures 5.3 and 5.4. Grossberg (1972a, passim) found that gated dipoles compute XOR ubiquitously in the brain, as an essential and natural function of agonist-antagonist relations and lateral inhibition.
Calculating parity is no longer thought essential to computation, but as we have
seen and as we shall see, dipoles and XOR are essential to noise suppression,
contrast enhancement, edge detection, normalization, long-term memory invariance, and a host of other indispensable properties of cognition.
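As a concrete, if greatly simplified, sketch, the following fragment shows how mutual inhibition between two poles yields exactly this truth table; the inhibition strength and threshold are illustrative values rather than Grossberg’s parameters.

```python
# Minimal sketch of how mutual (dipole) inhibition yields XOR-like behavior.
# Parameter values are illustrative, not taken from Grossberg's equations.
def dipole_xor(a, b, inhibition=1.0, threshold=0.5):
    pole_a = max(0.0, a - inhibition * b)   # A's pole, suppressed by B
    pole_b = max(0.0, b - inhibition * a)   # B's pole, suppressed by A
    return 1 if max(pole_a, pole_b) > threshold else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, dipole_xor(a, b))
# 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0, which reproduces the XOR column of table 5.2.
```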
Opportunistic Learning with Rebounds
The rate of neurotransmitter release from the presynaptic terminal is not constant. When a volley of signal spikes releases neurotransmitter, it is initially released at a higher rate. With repeated firing, the rate of release of neurotransmitter decreases. As suggested previously, this constitutes synaptic habituation
(see also Klein and Kandel 1978, 1980).
TABLE 5.2. Truth table for exclusive OR (XOR).

A:     1    1    0    0
B:     1    0    1    0
XOR:   0    1    1    0
A plausible mechanical explanation for this habituation is that while a knob
is inactive, neurotransmitter accumulates along the synaptic membrane (see
figure 3.7). When the first bursts of a signal volley depolarize this membrane,
the accumulated transmitter is released. The synapse receives a large burst of
neurotransmitter and a momentary advantage in dipole competition. Thereafter, transmitter continues to be released, but at a steadily decreasing rate.
Meanwhile, the inhibited pole of a dipole is dominated, but it is not slavish.
All the while it is dominated, it accumulates neurotransmitter reserves along
its synaptic membranes, preparing itself for an opportunistic rebound.
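The sketch below caricatures this dynamic with one habituating transmitter gate per pole (the recovery and depletion constants are arbitrary): after sustained signaling the active pole’s transmitter store is depleted while the quiet pole’s store is full, so an equal arousal burst delivered to both poles is gated more strongly through the previously inhibited pole.

```python
# Sketch of transmitter habituation in a gated dipole (illustrative constants).
# Transmitter z recovers toward 1.0 while idle and is depleted in proportion
# to the signal it gates; the gated output is signal * z.
def step(z, signal, recover=0.05, deplete=0.3, dt=1.0):
    dz = recover * (1.0 - z) - deplete * signal * z
    return z + dt * dz

z_on, z_off = 1.0, 1.0
for t in range(30):                       # sustained input to the "on" channel
    z_on = step(z_on, signal=1.0)
    z_off = step(z_off, signal=0.0)

burst = 1.0                               # nonspecific arousal reaches both poles
print(burst * z_on, burst * z_off)        # the rested "off" pole now gates more
# transmitter through than the habituated "on" pole, so it wins the dipole
# competition: a rebound.
```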
Rebounds make for opportunistic learning in a manner reminiscent of
Piagetian accommodation (Piaget 1975). Learning implies the learning of new
information, so if old information is not to be overwritten when the new information is encoded, a rebound complements and repartitions memory to accommodate the new information.
Expectancies and Limbic Rebounds
A few pages back it was suggested that a rebound could be caused by a teacher
saying, “No, wait. Learn a different pattern.” But what if there is no teacher?
For billions of years, intelligence had to survive and evolve in the “school of
hard knocks”—without a teacher. How could rebounds occur if there were no
teacher to cause them?
In figure 5.4b, we saw secondary, resonant fields form, encoding the input
pattern presented in figure 5.4a. In a richer environment, these secondary
resonant fields encode not only the immediate input but also, simultaneously,
the associated context in which input occurs. In the future, the learned z_ij traces of context alone can be sufficient to evoke the target pattern. These secondary fields are contextual expectancies, or, as Peirce put it in 1877, “nervous asso-
ciations—for example that habit of the nerves in consequence of which the
smell of a peach will make the mouth water” (9).
In 1904, Pavlov won the Nobel Prize for discovering (among other things)
that associations could be conditioned to arbitrary contexts. For example, if
one rings a bell while a dog smells a peach, one can later cause the dog’s mouth
to water by just ringing the bell. In 1932, Tolman gave such abstract contexts
the name we use here—expectancies. Grossberg’s suggestion (1980, 1982a) was
that failed expectancies could trigger rebounds, thereby making it possible for
animals and humanoids to learn new things during the billion years of evolution that preceded the appearance of the first teacher.
Evoked Potentials
As we have seen, the action potentials of neurons are electrical events, and in
1944, Joseph Erlanger and Herbert Gasser received the Nobel Prize for developing electronic methods of recording nerve activity. One outgrowth of their
work was the ability to measure evoked potentials—the voltage changes that a stimulus evokes from a nerve or group of nerves. Allowing for variation among different nerve groups, a fairly consistent pattern of voltage changes has been observed
in response to stimuli. During presentation of a stimulus sentence like
Mary had a little lamb, its fleece was white as coal (5.5)
[In the original figure, the labels CNV, P100, P300, and N400 are aligned beneath the sentence.]
a series of voltage troughs and peaks labeled CNV, P100, P300, and N400 may
be recorded by scalp electrodes positioned over language cortex. CNV, or contingent negative variation, is a negative voltage trough that can be measured as a usual correlate of a subject’s expectation of a stimulus, for example, sentence
5.5. A positive peak, P100, can then be recorded 100 ms after stimulus onset,
when a stimulus is first perceived. If, however, the stimulus is unrecognized—
that is, if it is unexpected—a P300 peak is reliably evoked, approximately 300 ms
after stimulus onset. The rapid succession of syllables in a sentence like 5.5 can
obscure these potentials on all but the last word, but in cases of anomalous
sentences like 5.5 (Starbuck 1993), there is a widespread N400 “anomaly component” which is clearly detectable over associative cortex.
Grossberg’s theory (1972b, 1980; Grossberg and Merrill 1992) is that P300
corresponds to a burst of NSA that is triggered by the collapse of an expectation resonance. More specifically, Grossberg (1982a) suggests that old knowledge and stable expectancies have deep, subcortical resonances extending even to the hippocampus and limbic system. If events in the world are going according to the mind’s plan, then heart rate, breathing, digestion, and all the other
emotions and drives of the subcortical nervous system are in resonance with
the perceived world, but a disconfirming event, whether it be the unexpected
attack of a predator, a teacher’s No!, or the simple failure of a significant expectancy, causes this harmonious resonance to collapse. This collapse unleashes
(disinhibits) a wave of NSA, causing a rebound and making it possible for the
cerebrum to accommodate new information.
Grossberg’s P300 theory is also compatible with Gray’s “comparator” theory
of hippocampal function, and together they give a satisfying explanation of
certain facts discussed in chapter 4, like HM’s anterograde amnesia. According to this Gray-Grossberg theory, HM could not learn new things because, with
a resected hippocampus, he could not compare cerebral experience against
his subcerebral emotional state and so could not generate rebounds. Without
rebounds, his cerebral cortex could not be partitioned into active and inactive
sites, so inactive sites could not be found to store new memories.
Although HM could not remember new names and faces, he could retain
certain new, less primal long-term memories such as the solution to the Tower
of Hanoi puzzle. By the Gray-Grossberg theory, HM might have been able to
retain these memories because nonemotional experiences are relatively non-
limbic and may therefore depend to a lesser extent on limbically generated
NSA. As we saw from figure 5.3, NSA is not the only way to trigger a rebound.
Turning down inputs by shifting attention (or closing the eyes) can generate a
rebound just like turning up inputs. This is why we are often able to solve a
problem by “sleeping on it.” Turning off the inputs that have been driving our
cognitive processes allows alternative hypotheses to rebound into activity and
successfully compete for LTM storage.
Grossberg’s P300 hypothesis suggests that Gray’s comparator theory might
be tested by evoked potentials, but unfortunately, the hippocampus and other
key limbic centers lie deep within the brain, inaccessible to measurement by
scalp electrodes and even by sophisticated devices like magnetoencephalograms
(MEG scans). On the other hand, measurement techniques which can “go
deep,” like positron emission tomography (PET) scans and functional magnetic reso-
nance imaging (fMRI) scans, resolve events on a scale of seconds at best—far
off the centisecond timescale on which evoked potentials occur.
Sequential parallel search
Repeated rebounds can enable a sequential parallel search of memory. If a
feedforward input pattern across some field F_1 does not match (that is, resonate with) an expected feedback pattern across F_2, a burst of NSA can cause a rebound across F_2, and a new feedback pattern can begin to resonate across F_2. If this new F_2 expectancy does not match the input, yet another rebound can cause yet a third feedback pattern to be instantiated across F_2. This process can repeat until a match is found.
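In outline, such a search can be sketched as a loop that reads out the currently strongest F_2 expectancy, tests it against the F_1 input, and rebounds (suppresses) it on a mismatch; the stand-in expectancies and the simple overlap criterion below are illustrative, not the model of the simulations above.

```python
# Sketch of a rebound-driven memory search (an ART-style loop, simplified).
# Stored F2 expectancies and the match test are stand-ins, not the book's model.
def matches(input_pattern, expectancy, criterion=0.9):
    overlap = sum(min(i, e) for i, e in zip(input_pattern, expectancy))
    return overlap / max(sum(input_pattern), 1e-9) >= criterion

def search(input_pattern, expectancies):
    suppressed = set()                      # hypotheses knocked out by rebounds
    while len(suppressed) < len(expectancies):
        # the most active, not-yet-rebounded expectancy reads out first
        k = max((i for i in range(len(expectancies)) if i not in suppressed),
                key=lambda i: sum(expectancies[i]))
        if matches(input_pattern, expectancies[k]):
            return k                        # resonance: input and expectancy agree
        suppressed.add(k)                   # mismatch: rebound and try another
    return None                             # no stored pattern resonates

F2 = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]]
print(search([0, 0, 1, 1], F2))             # -> 1, after rebounding hypothesis 0
```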
Peirce (1877) described this process as abduction, which he contrasted with
the classical logical processes of induction and deduction. He exemplified it
with the story of Kepler trying to square Copernicus’s theoretical circular orbits with the planets’ occasional retrograde motion. Kepler tried dozens of
hypotheses until he finally hit upon the hypothesis of elliptical orbits. While
Kepler’s trial-and-error abductive process was not logical in any classical sense,
it did reflect the logic of expectancies, trying first the expected classical, Euclidean variants of circular orbits before finding resonance in the theory of
elliptical orbits.
Such searches are not random, serial searches through all the 10^7,111,111-odd patterns in the mind. Rebounds selectively occur in circuits which have minimal LTM (minimal neurotransmitter reserves) and maximal STM activity. By
weighing experience against present contextual relevance, the search space is
ordered, and—unlike a serial computer search through a static data structure—
abduction quickly converges to a best resonance.
In this chapter we have modeled cerebral dynamics using only a handful of
basic differential equations. These equations nevertheless enabled us to build
a computer simulation which exhibits subtle cognitive dynamics and surprising similarities to real mammalian neocortex. Many of these dynamic properties are especially evident in language phenomena, to which we will turn in
chapter 7. First, however, chapter 6 must give the reader a quick survey of the
physics and physiology of speech and hearing.
• S I X •
Speech and Hearing
In the previous chapters we reviewed the central nervous system and introduced
adaptive resonance theory (ART) as a system for modeling it. Most of the data
presented in those chapters dealt with vision. Language, however, deals primarily with sound. Therefore, in this chapter we will first consider some essential, physical facts of speech. We will then see how these physical data are
processed by the ear and the auditory nervous system into the signals that the
cerebrum ultimately processes as language.
Speech
Periodic sounds
Speech may be divided into two types of sounds: periodic and aperiodic. These
types correspond roughly to the linguistic categories of vowel and consonant.
We will focus on these categories because our first linguistic neural networks
(in chapter 7) will model how we perceive them.
Vowels are the prototypic periodic speech sounds. They originate when
air is forced from the lungs through the glottis, the opening between the
vocal cords of the voice box, or larynx (figure 6.1).¹
When air flows between
them, the vocal cords vibrate. In figure 6.2, a single string, which could be a
violin string, a piano string, or a vocal cord, is shown vibrating.
Each time the string in figure 6.2 stretches to its right, air molecules are
set in motion to the right. They bump against adjacent molecules and then
bounce back to the left. In this manner a chain reaction of compressed “high-pressure areas” is set in motion to the right at the speed of sound. (Note that
molecule A does not wind up at position Z one second later. Rather, one must
imagine that molecule B bumps molecule C to the right. Molecule B then
bounces back and bumps molecule A to the left. Eventually, Y bumps Z.) When
(Figure 6.1. The larynx.)
the string springs back to the left, alternating low-pressure areas are formed
between the high-pressure areas. The resulting sound waveform spreads like
the waves from a pebble dropped in a still pond. If we plot these high- and low-
pressure areas (figure 6.2), we get a sinusoidal (sine-wave-shaped) sound wave.
These waves are commonly seen when a microphone transduces high and low
air pressure waves into electrical signals displayed on an oscilloscope.
The ordinate of the plot (AMP in figure 6.2) shows how the pressure rises
and falls. This measure of rise and fall is the wave’s amplitude. It corresponds
to the sound’s intensity or loudness. (The human ear does not hear very
low or very high sounds, so technically, “loudness” is how intense a sound
seems.) In figure 6.2 the wave is shown to complete two high-low cycles per
second. The wave’s measured frequency in figure 6.2 is therefore two hertz (2 Hz).²
Every string has a natural fundamental frequency (f_0), which depends upon its length and elasticity. When a string of given length and elasticity vibrates as a whole, as in figure 6.2, it produces the first harmonic (H_1). But if we divide the string in half by anchoring its midpoint, as in figure 6.3, each half-string will vibrate twice as fast as the whole. The plot of the resulting high- and low-pressure areas will be a wave with twice the frequency of f_0. Figure 6.3 is drawn to show such a wave at 4 Hz, twice the fundamental frequency in figure 6.2. In musical terms, H_2 sounds an octave higher than the fundamental frequency, f_0. Note, however, that each half-string in figure 6.3 moves less air than the whole string of figure 6.2, so the amplitude of H_2 is less than that of H_1.
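The relationship can be sketched numerically: a 2 Hz fundamental (H_1) and a second harmonic (H_2) at twice the frequency and, as in figure 6.3, at half the amplitude; the particular amplitude values are illustrative.

```python
# Sketch of the fundamental (H1) and second harmonic (H2) of a vibrating string.
# The 2 Hz fundamental mirrors figure 6.2; the halved H2 amplitude mirrors
# figure 6.3. Specific amplitude values are illustrative.
import math

f0 = 2.0                          # fundamental frequency, as in figure 6.2 (Hz)

def pressure(t):
    h1 = 1.0 * math.sin(2 * math.pi * f0 * t)        # whole string: H1 at f0
    h2 = 0.5 * math.sin(2 * math.pi * 2 * f0 * t)    # half-strings: H2 at 2*f0,
    return h1, h2                                    # moving less air (lower amplitude)

for t in (0.0, 0.0625, 0.125, 0.1875, 0.25):
    h1, h2 = pressure(t)
    print(f"t={t:6.4f}s  H1={h1:+.2f}  H2={h2:+.2f}")
# H2 completes two full cycles for every one cycle of H1, an octave higher.
```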