How the Brain Evolved Language (Oxford University Press)


Figure 4.11. 
Columnar organization in neocortex. (Eccles 1977, after Szentágothai 
1969. Reprinted by permission of McGraw-Hill Book Company.) 
Specific afferent fibers (“Spec. aff.” in figure 4.11) arise into the neocortical 
sheet, innervating smallish stellate cells (S_n), and defining a column. One 
supposes that it is difficult for a 
small synaptic input from the afferent axon collateral of a distant neuron to trig­
ger a response in a single large pyramidal cell. Rather, the small input triggers a 
chain reaction among the smaller stellate cells. This chorus then excites the large 
pyramidal cell of the column. When a column becomes thus innervated, the 
pyramidal cell eventually reaches threshold and generates the column’s output: 
a volley of spikes is sent along the axon to many other distant columns. 
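The chain-reaction account above can be caricatured in a few lines of code. This is a toy sketch with made-up numbers, not a claim about real thresholds: a weak afferent input recruits stellate cells one by one until their summed “chorus” drives the pyramidal cell past its much higher threshold.

```python
# Toy caricature of the stellate chain reaction, with made-up numbers:
# a weak afferent input cannot reach the pyramidal cell's high threshold
# by itself, but each stellate cell it recruits adds a unit of drive.
PYRAMIDAL_THRESHOLD = 5.0
STELLATE_THRESHOLD = 0.2

def column_response(afferent_input):
    drive = afferent_input
    stellates_recruited = 0
    if afferent_input < STELLATE_THRESHOLD:
        return False, 0           # too weak even to start the chain
    while drive < PYRAMIDAL_THRESHOLD:
        stellates_recruited += 1  # the chorus grows...
        drive += 1.0              # ...and each voice adds to the drive
    return True, stellates_recruited

fired, chorus = column_response(0.3)
```

A direct input of 0.3 never fires the pyramidal cell by itself, but the chain of recruited stellates does.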
Szentágothai’s schematic of this organization (figure 4.11) was developed 
after experiments by Mountcastle (1957) which demonstrated that neocortex 
responded better to electrodes placed perpendicular to the cortical sheet (line 
P in figure 4.12) than to electrodes inserted obliquely (O in figure 4.12). 
Independently of the work of Mountcastle and Szentágothai, Hubel and 
Wiesel popularized use of the term “column” in another sense (which we will 
encounter in chapter 5, figure 5.4f), so researchers began instead to use the term 
“barrel” to refer to the columns of figure 4.11 (Welker et al. 1996). In this meta­
phor, we think of a single, afferent axon as defining the center of a neural bar­
rel. Within the barrel, a number of excitatory pyramidal and stellate cells become 
activated by the input, as well as some number of basket (large dark cells in 
Szentágothai’s drawing) and chandelier cells (absent in Szentágothai’s drawing), 
which inhibit surrounding barrels.⁶ In this view, the barrel is more of a statistical 
entity, a kind of distribution of the probability of an afferent axon innervating 
excitatory and inhibitory cells. 

Figure 4.12. 
Perpendicular, not oblique, stimuli activate neocortex. 
In either view, we pause to ask what stops the inhibitory cells “in the barrel” 
from inhibiting the excitatory cells in the barrel. That is, what stops the barrel 
from committing neural suicide? The answer lies in inspection of the lateral 
extent of the axon collaterals of the inhibitory cells (figure 4.8). If we stipulate 
that these collaterals cannot consummate the act of synapsing until they reach a 
kind of neural puberty, then they can be prevented from synapsing with pyrami­
dal cells in their own barrel. This leads directly to the “planar” view of cortex. 
Planar Organization 
In the planar view of cortex, we look down upon the cortical sheet as in the 
surgical view, but we look more closely. Each afferent input defines the on-
center of a barrel, and surrounding that on-center are two concentric rings. 
Like a pebble dropped in a still pool, there is an on-center peak at the point of 
impact, and waves ripple out from it. The innermost, inhibitory wave follows a 
Gaussian probability distribution: it peaks at the radius where axons of most of 
the barrel’s inhibitory cells “reached puberty” and began to form synapses.

The outermost, excitatory wave follows a Gaussian probability distribution that 
peaks at the radius where the barrel’s excitatory cells reached puberty. These 
waves do not simply spread and dissipate, however. They interact in complex 
patterns with the waves of other barrels. 
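The pebble-in-a-pool picture can be sketched numerically. The radii, widths, and amplitudes below are illustrative assumptions; the code only reproduces the qualitative shape the text describes: an on-center peak at the point of impact, an inner inhibitory ring, and an outer excitatory ring, each a Gaussian in the distance from the afferent.

```python
import math

def ring_profile(r, peak_radius, sigma, amplitude):
    # A Gaussian "wave" peaking at the radius where the cells' axons
    # reached "puberty" and began to synapse.
    return amplitude * math.exp(-((r - peak_radius) ** 2) / (2 * sigma ** 2))

def net_influence(r):
    # Illustrative radii and amplitudes: an on-center peak at the point
    # of impact, an inner inhibitory ring, and an outer excitatory ring.
    on_center = ring_profile(r, 0.0, sigma=0.4, amplitude=2.0)
    inhibitory = ring_profile(r, 1.5, sigma=0.6, amplitude=1.0)
    excitatory = ring_profile(r, 3.5, sigma=0.6, amplitude=0.8)
    return on_center + excitatory - inhibitory

# Sample the profile out from the center: positive at the peak, negative
# over the inhibitory ring, positive again over the excitatory ring.
profile = [net_influence(0.5 * k) for k in range(10)]
```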
One of the first researchers to take this planar view and explore these com­
plex patterns was von der Malsburg (1973). Using a variant of the modeling 
equations developed by Grossberg (1972a), von der Malsburg constructed a planar 
computer model of striate cortex (figure 4.13). 

Figure 4.13. 
Von der Malsburg’s planar cortex. (Von der Malsburg 1973. Reprinted 
by permission of Springer-Verlag.) 

Von der Malsburg’s simulation used an on-center off-surround architecture to 
recognize inputs. Early neural 
network models simply sent the excitatory output of some individual “neurode” 
directly to some other neurode. Von der Malsburg essentially added the off-
surround, inhibitory cells that were missing in figure 4.11. When stimulated, each 
barrel now increased its own activity and decreased that of its neighbor. 
Missing, however, from von der Malsburg’s model was the fact that in neo­
cortex, barrels also send long-distance, excitatory pyramidal cell output to many 
other barrels. They also receive reciprocal excitatory feedback from those other 
barrels. In the next chapter we will build and test a neocortical model that adds 
these missing elements to the planar model. 

Adaptive Resonance 
One cannot step into the same river twice. 
Heraclitus 
In chapters 3 and 4, we glimpsed the marvelous biochemical and anatomical 
complexity of the human brain. But in a single breath of a summer wind, a 
million leaves turn and change color in a single glance. The mind need not 
read meaning into every turning leaf of nature, but neither the hundreds of 
neurochemical messengers of chapter 3 nor the forty-odd Brodmann areas of 
chapter 4 can begin to tally the infinite complexity of an ever-changing envi­
ronment. To gain even the smallest evolutionary advantage in the vastness of 
nature, a brain must combinatorially compute thousands and millions of pat­
terns from millions and billions of neurons. In the case of Homo loquens, as we 
estimated in chapter 1, the competitive brain must be capable of computing 
something on the order of 10^7,111,111 patterns. 
But how can we begin to understand a brain with 10^7,111,111 possible con­
figurations? As the reader by now suspects, our technique will be to study 
minimal anatomies—primitive combinations of small numbers of synapses. 
First we will model the behavior of these minimal anatomies. Then we will 
see how, grown to larger but self-similar scales, they can explain thought and 
language. 
We have already seen several minimal anatomies. In chapter 2 we some­
what fancifully evolved a bilaterally symmetrical protochordate with a six-
celled brain. Then, in chapter 4, we touched upon Hartline’s work detailing 
the horseshoe crab’s off-center off-surround retina and sketched a preview 
of the on-center off-surround anatomy of the cerebrum. Learning by on-
center off-surround anatomies has been the focus of Grossberg’s adaptive reso-
nance theory (ART), and it is from this theory that we now begin our approach 
to language. 
From Neocortex to Diagram: Resonant On-Center 
Off-Surround Anatomies 
Figure 5.1a is a reasonably faithful laminar diagram of neocortex, but for sim­
plicity each barrel is modeled by a single excitatory pyramidal cell and a single 
inhibitory cell. Afferent inputs arise from the white matter beneath the cortex 
and innervate the barrels. A single fine afferent axon collateral cannot by it­
self depolarize and fire a large pyramidal cell. So figure 5.1a has the afferent 
fiber fire smaller, stellate cells first. These stellate cells then fire a few more 
stellate cells, which each innervate a few more stellate cells, and so on. Even­
tually, by this kind of nonlinear mass action, an activated subnetwork of stel­
late cells fires the barrel’s large pyramidal and inhibitory cells. The on-center 
pyramidal cell sends long-distance outputs, while the inhibitory cell creates an 
off-surround. 
Figure 5.1. 
Three schematics of on-center off-surround anatomies. (a) is a biologi­
cally faithful schematic detailing pyramidal cells and inhibitory basket cells. (b) and 
(c) abstract essential design elements.

In figure 5.1b, we abstract away from figure 5.1a, and we no longer explic­
itly diagram inhibitory cells. Following White 1989, we also treat stellate cells as 
small pyramidal cells, so each node in F2 of figure 5.1b can be interpreted as 
either a local subnetwork of stellate-pyramidal cells or a distal network of pyra­
midal cells. In either case, F1 remains an on-center off-surround minimal anatomy. 
Figure 5.1c abstracts still further, no longer explicitly diagramming the on-
center resonance of F2 nodes. In the diagrams of minimal anatomies that fol­
low, it is important that the reader understand that a circle can stand for one 
cell or many, while “on-center loops” like those in figure 5.1c can represent 
entire, undiagrammed fields of neurons. Since we will focus almost exclusively 
on cerebral anatomies, and since the on-center off-surround anatomy is ubiq­
uitous in neocortex, I will often omit even the on-center loops and off-surround 
axons from the diagrams. 
Gated Dipole Rebounds 
In a series of papers beginning in 1972, Grossberg reduced the on-center off-sur-
round architecture of figure 5.1 to the gated dipole minimal anatomy. This, in turn, 
led to a series of remarkable insights into the structure and functioning of mind. 
Consider, for example, the rather familiar example at the top of figure 5.2: 
stare at the black circles for fifteen seconds (longer if the lighting is dim). Then 
close your eyes. An inverse “retinal afterimage” appears: white circles in a black 
field!¹ 

Figure 5.2. 
A McCollough rebound occurs by switching the gaze to the lower pane 
after habituating to the upper pane. 

Although this percept is often called a “retinal afterimage,” it arises 
mainly in the lateral geniculate nucleus of the thalamus and neocortex (Livings­
tone and Hubel 1987). If, while staring at figure 5.2, a flashbulb suddenly in-
creases the illumination, an inverse image also appears—and it can occur during 
as well as after image presentation. (If you don’t have a flashbulb handy, you 
can simulate this effect by staring at figure 5.2 and then abruptly shifting your 
gaze to the focusing dot in the center of the all-white pane at the bottom of figure 
5.2.) Both decreasing illumination (closing the eyes) and increasing illumina­
tion (the flashbulb effect) can create inverse percepts, and this can happen 
during, as well as after, a sensation. We can account for all of these effects with 
the minimal anatomy in figure 5.3. 
Figure 5.3. 
The McCollough effect. A red-green gated dipole: (a) With white-light 
input, both poles are active. (b) With red input, the red pole is active and neuro­
transmitter depletes at the r0–r1 synapse. (c) Closing the eyes allows background 
activity in the green pole to dominate competition and produce a retinal after­
image. (d) Alternatively, NSA (e.g., a flash of white light) can produce an after­
image rebound, even while red input is maintained. 

In the human visual system, black and white, blue and yellow, and red and 
green response cells are all arrayed in gated dipoles. This leads to a group of 
phenomena collectively known as the McCollough effect (McCollough 1965; see 
also Livingstone and Hubel 1987). Under white light, as schematized in figure 
5.3a, red and green receptor cells compete to a standoff. White is perceived, but 
no red or green color percept is independently output. In figure 5.3b, red light 
illuminates the dipole. The red pole inhibits the green pole via the inhibitory 
interneuron i_rg, so only the red pole responds, and a red percept is output from 
r2. After protracted viewing under intense illumination, however, neurotransmit­
ter becomes depleted at the r0–r1 synapse, and the dipole becomes unbalanced. 
Neurons maintain a low level of random, background firing even in the absence 
of specific inputs, so if specific inputs are shut off (e.g., if the eyes are closed), as 
in figure 5.3c, then the green pole will come to dominate the depleted red pole 
in response to background activation. On the other hand, in figure 5.3d, a burst 
of white light stimulates the unbalanced dipole. Because this burst of white light 
contains equal amounts of red and green light, it is an example of what ART 
calls nonspecific arousal (NSA). Even if the original red input is maintained dur­
ing this burst, so more red than green remains in the total stimulus spectrum, 
the green pole still gains control because of its greater neurotransmitter reser­
voir at synapse g0–g1.² 
One of Sherrington’s many contributions to the understanding of the 
nervous system was his description of neuromuscular control in terms of ago-
nist-antagonist competition. After Sherrington, “antagonistic rebounds,” in 
which an action is reflexively paired with its opposite reaction, began to be 
found everywhere in neurophysiology. Accordingly, Grossberg referred to 
events like the red-green reversal of figure 5.3 as “antagonistic dipole rebounds.” 
Mathematical Models of Cerebral Mechanics 
Grossberg analyzed the gated dipole and many related neural models math­
ematically, first as a theory of “embedding fields” and then as the more fully 
developed Adaptive Resonance Theory. Our purpose in the following chap­
ters will be to analyze similar neural models linguistically, but the preceding 
example offers an opportunity to make a simplified presentation of Grossberg’s 
mathematical models. One of the advantages of mathematical analysis is that, 
although it is not essential to an understanding of adaptive grammar, it can 
abstract from the flood of detail we encountered in the previous chapters and 
bring important principles of system design into prominence. A second rea­
son to develop some mathematical models at this point is that in the second 
half of this chapter we will use them to build a computer model of a patch of 
neocortex. We will then be better prepared to explore the question of how 
language could arise through and from such a patch of neocortex. Finally, many 
of the leading mathematical ideas of ART are really quite simple, and they can 
give the nonmathematical reader a helpful entry point into the more mathe­
matical ART literature. 

ART Equations 
The central equations of Grossberg’s adaptive resonance theory model a short-
term memory trace x and a long-term memory trace z. Using Grossberg’s basic no­
tation, equations 5.1 and 5.2 are differential equations that express the rate of 
change of short-term memory (x in 5.1) and long-term memory (z in 5.2): 

ẋ_j = –Ax_j + Bx_i z_ij   (5.1) 

ż_ij = –Dz_ij + Ex_i x_j   (5.2) 

Equation 5.1 is a differential equation in dot notation. It describes the rate of 
change of a short-term memory trace, x_j. We can think of this short-term memory 
trace as the percentage of Na+ gates that are open in a neuron’s membrane. 
In the simplest model, x_j decreases at some rate A. We can say that A is the rate 
at which the neuron (or neuron population) x_j “forgets.” Equation 5.1 also states 
that x_j increases at some rate B. The quantity B is one determinant of how fast 
x_j depolarizes, or “activates.” We can think of this as the rate at which x_j “learns,” 
but we must remember that here we are talking about short-term learning. Per­
haps we should say B determines how fast x_j “catches on.” 

The rate at which x_j catches on also depends upon a second factor, z_ij, the 
long-term memory trace. We can think of this long-term memory trace as the 
size of the synapse from x_i to x_j (cf. the synapses in figure 5.3). Changes in z_ij 
are modeled by equation 5.2. Equation 5.2 says that z_ij decreases (or forgets) 
at some rate D, and that z_ij also increases at some rate E, a function of x_i 
times x_j. We can say that z_ij “learns” (slowly) at the rate E. 
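For readers who like to tinker, equations 5.1 and 5.2 can be stepped forward with the Euler method. The rate constants below are arbitrary illustrations, and this linear form is unbounded (the sigmoidal bounding discussed later in the chapter is omitted), but the sketch shows the resonant coupling: x_j drives z_ij, and z_ij in turn amplifies x_j.

```python
# Euler-method sketch of equations 5.1 and 5.2; rate constants are
# arbitrary illustrations, and the linear form is left unbounded.
A, B = 0.5, 1.0    # STM "forgetting" and "catching on" rates
D, E = 0.01, 0.1   # LTM forgetting and (slow) learning rates
dt = 0.1

def step(x_j, x_i, z_ij):
    dx_j = -A * x_j + B * x_i * z_ij    # eq. 5.1
    dz_ij = -D * z_ij + E * x_i * x_j   # eq. 5.2
    return x_j + dt * dx_j, z_ij + dt * dz_ij

x_j, z_ij = 0.0, 0.2
for _ in range(100):
    # Sustained presynaptic activity: x_i is held at 1.0 throughout.
    x_j, z_ij = step(x_j, 1.0, z_ij)
```

After a hundred steps both traces have grown: the STM trace has "caught on" and the LTM trace has slowly "learned."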
It is important to note that A, B, D, and E are all shorthand abbreviations 
for what, in vivo, are very complex and detailed functions. The rate B, for ex­
ample, lumps all NMDA and non-NMDA receptor dynamics, all glutamate, 
aspartate, and GABA neurotransmitters, all retrograde neurotransmitters and 
messengers, all neurotransmitter release, reuptake, and manufacture processes, 
membrane spiking thresholds, and who knows what else into a single abstract 
function relating barrel x_j to barrel x_i across synapse z_ij. This may make B seem 
crude (and undoubtedly it is), but it is the correct level of abstraction from 
which to proceed. 
It is also important to note that, by self-similarity, ART intends x_j and z_ij to 
be interpretable on many levels. Thus, when speaking of an entire gyrus, x_j 
might correlate with the activation that is displayed in a brain scan. When speak­
ing of a single neuron, x_j can be interpreted as a measure of the neuron’s ac­
tivation level above or below its spiking threshold. When speaking of signal 
propagation in the dendritic arborization of a receptor neuron, x_j can be 
interpreted as the membrane polarization of a dendritic branch. In these last 
two cases, a mathematical treatment might explicitly separate a threshold func­
tion Γ out from equation 5.1, changing +Bx_i z_ij into something like +BΓ(x_i, z_ij). 
Usually, however, ART equations like 5.1 and 5.2 describe neural events on the 
scale of the barrel or of larger, self-similar subnetworks. At these scales, x_i may 

have dozens or thousands of pyramidal output neurons, and at any particular 
moment, 3 or 5 or 5000 of them may be above spiking threshold. The subnet­
work as a whole will have a quenching threshold, above which the activity of neu­
rons in the subnetwork will be amplified and below which the activity of 
neurons in the subnetwork will be attenuated. But the subnetwork as a whole 
need not exhibit the kind of “all-or-none” discrete spiking threshold that has 
been claimed for individual neurons, so ART equations do not usually elabo­
rate a term for thresholds. Instead, they use nonlinear gating functions. 
Nonlinearity 
There is only one way to draw a straight line, but there are many ways to be 
nonlinear, and nonlinearity holds different significance for different sciences. 
ART equations are nonlinear in two ways that are especially important to cog­
nitive modeling: they are (1) sigmoidal and (2) resonant. 
The curves described by ART equations and subfunctions (like A, B, D, and 
E above) are always presumed to be sigmoidal (S-shaped). That is to say, they 
are naturally bounded. For example, a neuron membrane can only be activated 
until every Na+ channel is open; it cannot become more activated. At the lower 
bound, a membrane can be inhibited only until every Na+ channel is closed; it 
cannot become more inhibited.³ So x_j has upper and lower limits, and a graph 
of its response function is sigmoidal. In equation 5.2, z_ij is similarly bounded 
by the sigmoidal functions D and E. 

Equations 5.1 and 5.2 also form a nonlinear, resonant system. The LTM trace 
z_ij influences x_j, and x_j influences z_ij. Both feedforward and feedback circuits 
exist almost everywhere in natural neural systems, and feedforward and feed­
back circuits are implicit almost everywhere in ART systems: for every equa­
tion 5.1 describing x_j’s response to x_i, there is a complementary equation 5.1' 
describing x_i’s reciprocal, resonant, response to x_j.⁴ This is the same kind of 
nonlinearity by which feedback causes a public address system to screech out 
of control, but equations 5.1 and 5.2 are bounded, and in the neural systems 
they describe, this kind of feedback makes rapid learning possible. 
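A minimal two-node sketch shows why sigmoidal bounding tames this feedback. With a linear signal, the reciprocal pair would scream off to infinity like the public address system; passing each node's signal through a bounded S-shaped function instead lets the pair settle into a finite resonance. All rates here are hypothetical.

```python
import math

def sigmoid(u):
    # Bounded, S-shaped signal function: saturates between 0 and 1.
    return 1.0 / (1.0 + math.exp(-u))

# Two reciprocally connected nodes (eq. 5.1 and its mirror 5.1'),
# with hypothetical rates A and B. The sigmoid caps each node's
# drive, so activity cannot exceed B/A = 4.
A, B, dt = 0.5, 2.0, 0.1
x_i, x_j = 0.6, 0.0
for _ in range(500):
    x_i += dt * (-A * x_i + B * sigmoid(x_j))
    x_j += dt * (-A * x_j + B * sigmoid(x_i))
# Both activities converge to the same finite resonant level.
```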
Shunting 
The fact that the terms of ART equations are multiplicative is an important 
detail which was not always appreciated in earlier neural network models. 
Imagine that table 5.1 presents the results of four very simple paired associate 
learning experiments in which we try to teach a parrot to say ij (as in “h-i-j-k”). 
In experiment A, we teach the parrot i and then we teach it j. The parrot learns 
ij. In experiment B, we teach the parrot neither i nor j. Not surprisingly, the 
parrot does not learn ij. In experiment C, we teach the parrot i, but we do not 
teach it j. Again, it does not learn ij. In experiment D, we do not teach the 
parrot i, but we do teach it j. Once again, it does not learn ij. 

TABLE 5.1. Truth table for logical AND (multiplication). 

Experiment    Taught i (x_i)    Taught j (x_j)    Learned ij? 
A             1                 1                 1 
B             0                 0                 0 
C             1                 0                 0 
D             0                 1                 0 
The not-very-surprising results of our parrot experiment clearly reflect the 
truth table for multiplication. So ART computes learning by multiplying x_i by 
x_j in equation 5.2. Similarly, in equation 5.1, ART multiplies x_i by z_ij. In the litera­
ture on artificial neural networks and elsewhere, engineers often refer to such 
multiplicative equations as shunting equations. 
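The parrot table reduces to one multiplicative line of code: the LTM increment of equation 5.2 is proportional to x_i times x_j, so it is nonzero only when both factors are.

```python
# The four parrot experiments as the multiplicative term of eq. 5.2:
# the LTM increment is E * x_i * x_j, nonzero only when both inputs are.
E = 1.0

def ltm_increment(x_i, x_j):
    return E * x_i * x_j

experiments = {"A": (1, 1), "B": (0, 0), "C": (1, 0), "D": (0, 1)}
learned = {name: ltm_increment(*x) > 0 for name, x in experiments.items()}
# Only experiment A, where i and j are both taught, learns "ij".
```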
Habituation 
In the psychological literature habituation is said to occur when a stimulus ceases 
to elicit its initial response. The rebound described in figure 5.3 is a common 
example of a habituation effect. Although the term is widely and imprecisely 
used, we will say that habituation occurs whenever neurotransmitter becomes 
depleted in a behavioral subcircuit. Neurotransmitter depletion can be de­
scribed with a new equation: 
ṅ_ij = +Kz_ij – Fn_ij x_i   (5.3) 
Equation 5.3 states that the amount of neurotransmitter n at synapse ij grows 
at some rate K, a function of z_ij, the capacity of the long-term memory (LTM) 
trace. By equation 5.3, n_ij is also depleted at a rate F, proportional to the pre­
synaptic stimulation from x_i. Put differently, z_ij represents the potential LTM 
trace, and n_ij represents the actual LTM trace. Put concretely, at the scale of 
the single synapse, z_ij can be taken to represent (among other factors) the avail­
able NMDA receptors on the postsynaptic membrane, while n_ij represents 
(among other factors) the amount of presynaptic neurotransmitter available 
to activate those receptors. Given equation 5.3, equation 5.1 can now be elabo­
rated as 5.4: 
ẋ_j = –Ax_j + B Σ_{i≠j} n_ij x_i – C Σ_{k≠j} n_kj x_k   (5.4) 

Equation 5.4 substitutes actual neurotransmitter, n_ij, into the original Bx_i z_ij 
term of equation 5.1. It then elaborates this term into two terms, one summing 
multiple excitatory inputs, +B Σ n_ij x_i, and the second summing multiple 
inhibitory inputs, –C Σ n_kj x_k. This makes explicit the division between the 
long-distance, on-center excitatory inputs and the local, off-surround inhibitory 
inputs diagrammed in figure 5.1. 
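A single Euler step of equation 5.4 for one barrel might be sketched as follows, with hypothetical rate constants and hand-picked inputs; each input is a (transmitter, activity) pair, gated multiplicatively as in equation 5.3.

```python
# One Euler step of equation 5.4 for a single barrel j, with
# hypothetical rate constants. Each input is a (transmitter, activity)
# pair (n, x), gated multiplicatively.
A, B, C, dt = 0.5, 1.0, 1.5, 0.1

def step_barrel(x_j, exc_inputs, inh_inputs):
    """exc_inputs: (n_ij, x_i) pairs from distant barrels i;
    inh_inputs: (n_kj, x_k) pairs from neighboring barrels k."""
    excitation = B * sum(n * x for n, x in exc_inputs)
    inhibition = C * sum(n * x for n, x in inh_inputs)
    return x_j + dt * (-A * x_j + excitation - inhibition)

# Strong long-distance excitation, weak local inhibition:
x_j = step_barrel(0.0,
                  exc_inputs=[(0.8, 1.0), (0.9, 0.5)],
                  inh_inputs=[(0.5, 0.2)])
```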

Habituation, specific and nonspecific arousal, and lateral inhibition, as 
described in equations 5.2–5.4, give rise to a computer model of cerebral cor­
tex and a range of further cognitive phenomena, including noise suppression, 
contrast enhancement, edge detection, normalization, rebounds, long-term 
memory invariance, opportunistic learning, limbic parameters of cognition, 
P300 and N400 evoked potentials, and sequential parallel memory searches. 
These, in turn, form the cognitive basis of language. 
A Quantitative Model of a Cerebral Gyrus 

Figure 5.4a–e shows what happens when equations 5.2–5.4 are used to create 
a computer model of a cerebral gyrus.⁵ The model gyrus in figure 5.4 is twenty-
three barrels high by forty-eight barrels wide. Each barrel forms forty-eight 
excitatory synapses with other barrels at radius 3–4 and twenty-four inhibitory 
synapses with twenty-four barrels at radius 1–2. The gyrus is modeled as a closed 
system in the shape of a torus: barrels at the top edge synapse with the bottom 
edge, and barrels at the left edge synapse with the right edge. Each synapse’s 
z_ij and n_ij are modeled by equations 5.2 and 5.3, while each barrel’s x_j is mod­
eled by equation 5.4. Figure 5.4 displays the activation level of each barrel (its 
x_j value) according to the gray scale at the bottom: black is least active and white 
is most active. 
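The toroidal bookkeeping is the only delicate part of such a model, and it reduces to modular arithmetic. The sketch below is an assumption-laden reconstruction, not the book's code: it uses Chebyshev (square-ring) distance, which yields exactly the twenty-four inhibitory neighbors at radius 1–2; the radius 3–4 ring then contains fifty-six positions, among which the model's forty-eight excitatory synapses would be placed.

```python
# Toroidal bookkeeping for a 23-by-48 sheet of barrels: edges wrap, so
# every barrel has a full complement of neighbors. Chebyshev distance
# is an assumption about the ring geometry.
HEIGHT, WIDTH = 23, 48

def wrap(y, x):
    # Top edge synapses with bottom, left edge with right.
    return y % HEIGHT, x % WIDTH

def ring(y, x, inner, outer):
    """Barrels at Chebyshev distance inner..outer from (y, x)."""
    cells = set()
    for dy in range(-outer, outer + 1):
        for dx in range(-outer, outer + 1):
            d = max(abs(dy), abs(dx))
            if inner <= d <= outer:
                cells.add(wrap(y + dy, x + dx))
    return cells

inhibitory_surround = ring(0, 0, 1, 2)   # wraps across all four edges
excitatory_ring = ring(0, 0, 3, 4)
```

For the corner barrel (0, 0), part of each ring wraps to the far edges of the sheet, which is exactly what makes the system closed.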
At time t = 0 in figure 5.4a, the gyrus is an inactive, deep-gray tabula rasa. 
Specific inputs I are applied to target nodes at [x y] coordinates [10 9], [10 
11], [10 13], and [10 15], and a black, inhibited surround begins to form. At 
t = 1 after another application of specific inputs to the target field, resonant 
activation begins to appear at radius 3–4 (figure 5.4b). 
Noise suppression 
At time t = 1 (figure 5.4b), the target nodes are activated above the level of the 
rest of the gyrus, and a black, inhibitory surround has formed around them. 
The inhibitory surround is graphic illustration of how noise suppression arises as 
an inherent property of an on-center off-surround system: any noise in the 
surround that might corrupt the on-center signal is suppressed. 
Contrast enhancement 
Figure 5.4b also illustrates contrast enhancement, a phenomenon similar to noise 
suppression. The target nodes at t = 1 are light gray; that is, their activity is con­
trastively “enhanced” above the background and the target nodes at t = 0 (fig­
ure 5.4a). This enhancement is not only due to repeated, additive specific 
inputs; it is also due to resonant feedback to the target node field from the 
emerging active fields at radius 3–4 and the lateral inhibition just described 
under noise suppression. 
