14. Global Etymologies
excluded from his Indo-European family.² For Indo-European, and for the
Indo-Europeanists who came to dominate historical linguistics, the problem
of classification was essentially resolved by Jones, and the later additions of
a few more obvious branches, such as Tocharian and Anatolian, did not alter
this state of affairs.
The problems that Greenberg confronted, however, when he set out to clas-
sify the languages of Africa, were quite different from those facing a historical
linguist investigating an already-defined family. Greenberg was confronted by
over 1,000 languages, only some of which fit into well-defined families (e.g.
Semitic, Bantu), and among which there was little understanding of the rela-
tionships. Under these circumstances, where does one start? Obviously the
only way to begin is by the comparison of basic lexical items and grammatical
formatives in all the languages, which inevitably leads to a classification of
the languages into a certain number of groups defined by recurring similari-
ties. This is exactly what Jones had done when he identified Indo-European,
stressing, as he did, “a stronger affinity, both in the roots of verbs and in
the forms of grammar, than could possibly have been produced by accident.”
He said nothing of sound correspondences or reconstruction, for in fact these
concepts came to prominence (despite the earlier work of Rask, Grimm, and
Bopp) only in the second half of the nineteenth century.
We believe, in short, that there is really no conflict between Greenberg’s
method of classifying languages and what is often referred to rather inexplic-
itly as “the standard methodology.” The standard methodology is used to
investigate family-internal problems; it does not—at least as it is explained in
the basic textbooks referred to above—tell one how to identify language fam-
ilies. Accordingly, it does not tell one how to classify the world’s languages.
This, rather, is what Greenberg’s work does, and it is, furthermore, how
Greenberg views what he does. It has recently been alleged that he himself
subscribes to the view that his methods differ from the standard methodol-
ogy: “Greenberg (1987) makes clear that he believes such groupings [as Al-
taic, Hokan, and Amerind] cannot be reached by the standard comparative
method; a wholly different method, mass comparison, is required” (Nichols
1990: 477). That this is, in fact, exactly the opposite of Greenberg’s views is
shown in the following:
Statements from certain American Indianists that I have rejected comparative lin-
guistics and have invented a new unorthodox method called mass or multilateral
comparison are repeated again and again in the press. However, as I clearly stated
in Greenberg (1987: 3), once we have a well-established stock I go about comparing
and reconstructing just like anyone else, as can be seen in my various contributions
to historical linguistics. However, as I pointed out long ago in regard to my gen-
erally accepted African classification, the first step has to be to look very broadly,
on at least a continent-wide scale, to see what the obvious groupings are. How
can one start to apply the comparative method until one knows what to compare?
(Greenberg 1990: 8)
² The term Indo-European was not introduced until the nineteenth century.
RECONSTRUCTION
It is remarkable how frequently reconstruction is confounded with tax-
onomy. For a moment’s reflection should make it clear that one can only
begin reconstructing a proto-language after one has decided which languages
belong to the putative family. Until one has delineated a set of seemingly re-
lated languages, collectively distinct from all others, by the methods outlined
at the outset of this chapter, there is simply nothing to reconstruct. (After
the fact, of course, reconstruction and (re)classification may enjoy a fruitful
feedback.) And as for the supposed validating effect of reconstruction, would
anybody claim that a bad reconstruction invalidates a well-defined family such
as Indo-European? Or that a brilliant reconstruction could show that Slavic,
Ob-Ugric, and Basque form a valid family? As a process, reconstruction is
entirely different from taxonomy, and the two should not be confused. It is for
this reason that Bynon’s claim that Greenberg uses multilateral comparison
as a “substitute” for reconstruction really makes no sense, and it is certainly
not anything that Greenberg has ever written or said or even suggested.
SOUND CORRESPONDENCES
Perhaps the greatest source of confusion in recent taxonomic debates has
been the role that sound correspondences, for example Grimm’s Law, play
in classification. It is clear that many historical linguists see regular sound
correspondences as playing some crucial role in identifying valid linguistic
taxa. In reality, sound correspondences are discovered only after a linguistic
family has been identified, for the simple reason that sound correspondences
are properties of particular linguistic families. They are not—and could not
be—a technique for discovering families. When the Indo-European sound
correspondences were worked out in the nineteenth century, not for a minute
did any of the Indo-Europeanists imagine that they were “proving” Indo-
European, the validity of which had not been in doubt for decades.
There are several reasons why sound correspondences have become en-
meshed with taxonomic questions. First, it is sometimes alleged that it is
only by means of regular sound correspondences that borrowings can be dis-
criminated from true cognates. It has long been recognized, however, that
loanwords often obey regular sound correspondences as strictly as do true
cognates, a point emphasized on several occasions by Greenberg (1957, 1987).
Campbell (1986: 224) makes the same point: “It ought to be noted that such
agreements among sounds frequently recur in a number of borrowed forms,
mimicking recurrent sound correspondences of true cognates.”
Another alleged use of sound correspondences is to discriminate superficial
look-alikes from true cognates (see the quote by Bynon above); true cognates,
it is claimed, do not look alike and can only be recognized by means of sound
correspondences. Thus, the commonly accepted Indo-European sound
correspondences show that Armenian erku ‘2’ and Latin duo ‘2’ are cognate,
despite their different form, whereas English bad and Farsi bad are not cog-
nate, despite their identity of form. Campbell has aptly criticized such views:
Identical or very similar sound matchings do not necessarily imply loans or weak
evidence of genetic connection. . . . With a time depth approaching that of the
Indo-European languages of Europe, the Mayan correspondences are on the whole
identical or are the result of single natural and recurrent changes. Proto-Mayan *p,
*m, *n, and *y are reflected unchanged, with identical correspondences, in all of
the over thirty Mayan languages. All other correspondences are very similar. Even
English, after its many changes, reflects Proto-Indo-European *r, *l, *m, *n, *s, *w,
and *y unchanged, on the whole.
A quick survey of once-disputed but now established remote genetic relationships
reveals that identical (or very similar) sound correspondences are not that unusual
. . . .
Therefore, identical correspondences should not be shunned nor too speedily
attributed to borrowing. While longer separation may provide greater opportunity for
unusual and exotic correspondences to develop in cases of distant genetic
relationship, it is in no way necessary for such developments to have taken place nor for
correspondences to be non-identical (1986: 221–23).
Indeed, when one looks at the reconstructions that have been proposed for
almost any family, one is able to find modern languages that preserve the
proposed ancestral forms virtually unchanged. To cite just a few examples,
Proto-Indo-European *nēpot- ‘nephew, son-in-law’ is strikingly similar to modern
Rumanian nepot, and Proto-Indo-European *mūs ‘mouse’ was preserved
without change in Latin, Old English, and Sanskrit. Proto-Austronesian *sepat
‘2’ is almost identical with Rukai sepate, and Proto-Austronesian *matˢa ‘eye’
is identical with Rukai matˢa. Proto-Uralic *tule ‘fire’ is preserved in Finnish
tule-, and Proto-Uralic *mośka ‘to wash’ differs little from Estonian mõske-.
At an even greater time depth, we find that Proto-Nostratic *nato ‘female
relation by marriage’ has survived, in Uralic, as Finnish nato ‘husband’s or wife’s
sister’ and, in Dravidian, as Malayalam nāttūn ‘husband’s sister, brother’s
wife,’ while Proto-Nostratic *pʰalV ‘tooth’ survives in Dravidian as Telugu
palu and in Altaic as Ulch palu. At a time depth perhaps even greater than
that of Nostratic, we find Proto-Australian *buŋku ‘knee’ preserved in Dyirbal
buŋku.
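The contrast between transparent and opaque pairs can be made concrete with a crude measure of surface similarity. What follows is only a minimal sketch and not a method used in this chapter: it assumes that the ratio computed by Python's standard difflib module is an acceptable stand-in for "looking alike," and the helper names normalize and similarity are illustrative. The forms are those cited in the text.

```python
import unicodedata
from difflib import SequenceMatcher

# Pairs of forms cited in the text: (label, form, label, form).
PAIRS = [
    ("Proto-Indo-European", "*nēpot-", "Rumanian", "nepot"),
    ("Proto-Uralic", "*tule", "Finnish", "tule-"),
    ("Proto-Nostratic", "*nato", "Finnish", "nato"),
    ("Telugu", "palu", "Ulch", "palu"),
    ("Armenian", "erku", "Latin", "duo"),
    ("English", "bad", "Farsi", "bad"),
]


def normalize(form: str) -> str:
    """Drop the reconstruction asterisk and boundary hyphens (sketch convention)."""
    form = unicodedata.normalize("NFC", form)
    return form.strip("*-").lower()


def similarity(a: str, b: str) -> float:
    """Surface similarity in [0, 1]; an assumed proxy, not a test of cognacy."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


for lang_a, form_a, lang_b, form_b in PAIRS:
    score = similarity(form_a, form_b)
    print(f"{lang_a} {form_a} / {lang_b} {form_b}: {score:.2f}")
```

On these forms the transparent pairs score at or near 1.0 and erku/duo scores low, while bad/bad scores 1.0 although the two words are not cognate. A score for a single pair is therefore no verdict on cognacy; it merely quantifies the surface resemblance under discussion.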
In the etymologies we present below, connecting all of the world’s language
families, the situation is not all that different from that within the families
just discussed. There are, in fact, many examples of sound correspondences
of the transparent variety discussed by Campbell. This initial stage of the
analysis is necessarily characterized by the identification of easily recognizable
similarities, just as was the discovery of Indo-European or any other family.
The refinement represented by exotic sound correspondences of the erku–
duo variety inevitably awaits a later stage in the analysis—the second stage,
which we have called “historical linguistics.” And it is important to recognize
that the work of this stage leads almost invariably to a refinement of the
etymologies, rather than a refinement of the classification.
Among the world’s language families, there are no doubt exotic sound
correspondences as well that we have not detected. It should be noted, nev-
ertheless, that as early as 1986 one of us (Bengtson) proposed some global
sound correspondences, and the Russian scholar Sergei Starostin (1991) has re-
cently published the most explicit statement of interphyletic sound correspon-
dences to date. His brief table of Nostratic–Dene-Caucasian correspondences,
though not quite global in scope, accounts for a vast expanse of the linguistic
world. Nostratic, for Starostin, includes ten of our 32 taxa (Kartvelian, Indo-
European, Uralic, Dravidian, Turkic, Mongolian, Tungus, Korean, Japanese-
Ryukyuan, and Eskimo-Aleut), and Dene-Caucasian, for Starostin, includes
Caucasian, Sino-Tibetan, Yeniseian, and Na-Dene—to which one may confi-
dently add both Basque and Burushaski (Bengtson 1991a,b). Thus, Starostin’s
equations account for roughly half of our 32 taxa, as well as the vast majority
of the Eurasian land mass. We find nothing in Starostin’s correspondences
that is inconsistent with the etymologies proposed below.
ON THE LIMITS OF THE COMPARATIVE METHOD
It has recently been widely asserted that the comparative method in linguis-
tics produces reliable results only for the past 5,000–10,000 years. According
to Kaufman (1990: 23), “A temporal ceiling of 7,000 to 8,000 years is inherent
in the methods of comparative linguistic reconstruction. We can recover ge-
netic relationships that are that old, but probably no earlier than that. The
methods possibly will be expanded, but for the moment we have to operate
within that limit in drawing inferences.” Similar statements from a host of
other scholars are given in Chapter 11, where such beliefs are identified as the
central myth of historical linguistics (Chapter 13 further analyzes such myths).
The origin of this myth, we believe, is an attempt by Indo-Europeanists to
“explain” why Indo-European has no known genetic connections—in our view
yet another myth. The fact that Indo-European is intimately connected with
numerous other families has been demonstrated beyond a reasonable doubt
by the Russian Nostraticists (Illich-Svitych 1971–84), a demonstration that is
complemented and extended by Greenberg (to appear).
We have shown that in numerous cases sounds (particularly stable ones
like nasal consonants and liquids)—and even entire words—have persisted
over time spans greater than 8,000 years virtually unchanged. This raises
the question why these evidently quite stable sounds must suddenly change
beyond recognition, or disappear entirely, beyond the supposedly insuperable
threshold of 10,000 years. If we can use modern languages to reconstruct
proto-languages that existed at least 6,000–8,000 years ago (e.g. Proto-Indo-
European, Proto-Uralic, Proto-Dravidian, Proto-Austronesian), why cannot
such earlier languages themselves be compared (as in fact we will do) in order
to discern still earlier groupings? Would it not be one of the more remarkable
coincidences in the history of science if Indo-European, the family in terms of
which comparative linguistics was discovered, turned out to define the tempo-
ral limit of comparative linguistics as well? That there is no such coincidence
is amply demonstrated in the etymologies we give below. We feel it is time
for linguists to stop selling the comparative method short and to apply it
consistently to the world’s linguistic taxa, without preconception. The present
chapter represents a step in this direction, an initial step that shows that all of
the world’s populations are linguistically connected. The culmination of these
efforts will be a comprehensive subgrouping of this single linguistic family.
BAD SEMANTICS
Another criticism of global etymologies in particular, and of long-range
comparison in general, is that such liberties are taken with semantic change
that literally anything can be connected with anything else, and it is certainly
true that many global etymologies proposed over the years have been semanti-
cally unconvincing. But for just that reason we have constrained the semantic
variation of each etymology very tightly, and few of the semantic connections
we propose would raise an eyebrow if encountered in any of the standard ety-
mological dictionaries. They are in fact semantically more conservative than
many proposed connections in Pokorny (1959), the standard Indo-European
etymological dictionary. Whatever damage this often alleged defect may have
done to earlier programs of long-range comparison, we believe that it does not
affect the etymologies presented below.
ERRORS IN THE DATA
Another often-cited criticism of long-range comparison is the presence of
errors in the data, errors that invalidate the overall hypothesis. This is a
specious argument, for it ignores both common sense and the standard mea-
sures of statistical significance. Genetic classification is not analogous to a
mathematical proof, wherein one false step undermines the complete
demonstration. Rather, the cumulative weight of all the evidence completely swamps
the effects of whatever random errors may be scattered through the work. As
Greenberg has often stressed—and has in fact shown in his work—multilateral
comparison yields valid genetic classifications even from decidedly degenerate
data. An example was Greenberg’s classification of Australian languages in
1953, using little more than the vocabularies published by E. M. Curr in
1886–87. The notion that data must be pristine and copious flies in the face
of commonly accepted historical method. It is all well and good for Kaufman
(1990: 18) to demand at least 500 items of basic vocabulary and 100 points
of grammar before “serious comparative work” can be carried out, but the
fact remains that Indo-Europeanists have classified Lydian as Indo-European,
without dissent, on the basis of a handful of words, as noted by Greenberg
(1990: 10). Similarly, David Payne (1991: 362) reports that “all that remains
of the [Shebayo] language is a vocabulary list of fifteen words collected at the
end of the 17th century. . . . Despite the paucity of data from this language,
it is quite clear that it is Arawakan.” Historians and historical linguists—not
to mention paleontologists working from handfuls of bashed fossils—use what-
ever material is available; they do not demand that the evidence be complete
or immaculate.
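The appeal to statistical significance can be illustrated with a binomial tail. The sketch below is offered under loudly stated assumptions that are not in the text: comparisons are treated as independent, every number (items compared, chance rate, resemblances observed, resemblances later discarded) is hypothetical, and the function name chance_tail is our own.

```python
from math import comb


def chance_tail(n: int, k: int, p: float) -> float:
    """P(at least k accidental resemblances in n independent comparisons)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# All values below are hypothetical and chosen only for illustration.
n_items = 100    # meanings compared
p_chance = 0.05  # assumed chance of an accidental resemblance per meaning
observed = 30    # resemblances noted
discarded = 5    # resemblances later traced to errors in the data

before = chance_tail(n_items, observed, p_chance)
after = chance_tail(n_items, observed - discarded, p_chance)

print(f"chance probability with all {observed} resemblances: {before:.3g}")
print(f"chance probability after discarding {discarded}:      {after:.3g}")
```

Under these assumptions, discarding the erroneous items raises the chance probability, yet it remains vanishingly small. That is the sense in which scattered errors do not undo the cumulative weight of the evidence.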
DISTRIBUTIONAL DIFFERENCES
It is often alleged that one can find anything in linguistic data if one looks
for it hard enough. Thus the global etymologies we present below are a tribute
more to our industry and enterprise than to real genetic connections. Such a
view is widespread among linguists who have never actually compared large
numbers of languages (or language families), but those of us who have done
this kind of work know the reverse to be true. “Wanting” to find something
is of very little help if it is not there. Greenberg (1987) points out that the
Amerind family has two general words for females, tuna ‘girl’ and kuna
‘woman.’ Both roots are abundantly attested throughout North and South
America, and both are found in all eleven branches of the Amerind family.
What is interesting about their distribution, however, is that whereas kuna
is widely attested in the Old World, as we show in etymology 11 below, we
have found no trace of tuna in the Old World. If it were really so easy to find
anything one looks for, why did we fail to find tuna in the roughly 4,500 Old
World languages, when it is so readily observed in the approximately 500 New
World languages? The evolutionary analysis provides a simple and natural
explanation: when the Amerind forebears first entered the New World they
brought with them the word kuna ‘woman,’ and only later did they invent
the word tuna ‘girl.’ That there is no trace of tuna ‘girl’ in the Old World
is because it never existed there.
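The arithmetic behind this argument can be spelled out. The following sketch assumes, purely for illustration, that an accidental look-alike turns up at some fixed rate q per language searched; the value of q is hypothetical, as are the function names, while the language counts are the approximate figures given above.

```python
OLD_WORLD = 4500  # approximate number of Old World languages (from the text)
NEW_WORLD = 500   # approximate number of New World languages (from the text)


def expected_hits(n_languages: int, q: float) -> float:
    """Expected number of accidental look-alikes at rate q per language."""
    return n_languages * q


def prob_no_hits(n_languages: int, q: float) -> float:
    """Probability of finding no accidental look-alike at all."""
    return (1 - q) ** n_languages


q = 0.01  # hypothetical chance rate per language

ratio = expected_hits(OLD_WORLD, q) / expected_hits(NEW_WORLD, q)
print(f"Old World / New World expected chance hits: {ratio:.1f}")
print(f"P(no chance hit in the Old World): {prob_no_hits(OLD_WORLD, q):.3g}")
```

Whatever value is assumed for q, the Old World search should yield about nine times as many accidental hits as the New World search. If look-alikes were easy to find by looking hard enough, a complete absence in the larger sample would be very improbable.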
GLOBAL ETYMOLOGIES
For each etymology, in what follows, we present a phonetic and semantic
gloss,³ followed by examples from different language families. Though we
have not attempted to present a unified phonetic transcription for all sources,
we have adjusted certain transcriptions from time to time to avoid potential
ambiguity. In the first etymology (but not elsewhere) yod has been normalized
to j in all citations. Ejectives have been normalized to p’, t’, k’, etc.; V
represents a vowel of indeterminate timbre; ĭ is used for the Old Church Slavic
soft sign and ŭ for the hard sign; and ∼ separates alternative forms. In the
two interrogative etymologies (10, 17), interrogative and relative uses are not
distinguished (‘who?’ as in “Who is that man?” vs. ‘who’ in “The man who
came to dinner.”). The intimate connection between the two is well known
and uncontroversial. Most of the cited forms are, however, true interrogatives.
The source of the information for each family represented in a given entry is
indicated by an abbreviation in brackets at the end of the entry. The number
following the abbreviation is either the etymology number in the original
source (if there is one) or the page number there. Since the existence of these
roots as characteristic features of the language families cited has already been
established by other scholars, and is not for the most part in question, we
do not give the complete documentation for each family, limiting ourselves
in most instances to an indication of the range of semantic and phonological
variation within the family. The reader who wishes to see every relevant form
for a given family should consult the sources cited. For Amerind, however,
we give extensive citations, in order to counterbalance the fallacious criticism
that has been directed at Greenberg’s work. Parts of etymologies that are
problematic, by dint of either phonetic or semantic divergence, or by restricted
distribution, are preceded by a question mark. The lack of a semantic gloss
following a form means that that form has the same meaning as the preceding
form.
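The notational conventions just described can be summarized as a small data structure. The sketch below is hypothetical throughout: the one-line entry layout, the placeholder forms (aaa, bbb, ccc), the source tag XY, and the names Form and parse_entry are all invented for illustration and do not reproduce any entry or abbreviation in this chapter. It also assumes that glosses contain no commas.

```python
import re
from dataclasses import dataclass

SEP = "∼"  # separates alternative forms


@dataclass
class Form:
    variants: list[str]  # alternative forms, split on SEP
    gloss: str           # inherited from the preceding form when omitted
    doubtful: bool       # True when the form is preceded by "?"


def parse_entry(line: str) -> tuple[list[Form], str, int]:
    """Parse a hypothetical layout: form ‘gloss’, ?form ∼ form [ABBR 12]."""
    m = re.fullmatch(r"(.*?)\s*\[(\w+)\s+(\d+)\]\s*", line)
    if m is None:
        raise ValueError("entry must end with a bracketed source tag")
    body, abbr, number = m.group(1), m.group(2), int(m.group(3))

    forms: list[Form] = []
    gloss = ""
    for chunk in body.split(","):
        chunk = chunk.strip()
        doubtful = chunk.startswith("?")
        chunk = chunk.lstrip("?").strip()
        g = re.search(r"‘(.*)’", chunk)  # greedy, so apostrophes in a gloss survive
        if g:
            gloss = g.group(1)
            chunk = chunk[: g.start()].strip()
        variants = [v.strip() for v in chunk.split(SEP)]
        forms.append(Form(variants, gloss, doubtful))
    return forms, abbr, number


# Placeholder entry; the number is an etymology number or a page number.
forms, abbr, number = parse_entry("aaa ‘x’, bbb ∼ bbc, ?ccc ‘y’ [XY 12]")
for f in forms:
    print(f.variants, f.gloss, "(problematic)" if f.doubtful else "")
print(abbr, number)
```

The second form carries no gloss of its own and so inherits that of the first, while the third is flagged as problematic by its question mark.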
³ We do not deal here with reconstruction, and these glosses are intended merely to
characterize the most general meaning and phonological shape of each root. Future work
on reconstruction will no doubt discover cases where the most widespread meaning or shape
was not original.
We make no claim to being the first to discover any of the etymologies
listed below. The pioneering work of Trombetti, Swadesh, Greenberg, Illich-
Svitych, Dolgopolsky, and Starostin has identified numerous widespread roots.
What we have tried to do is to make each etymology more complete and more
soundly documented in this incarnation than it may have been in previous
ones. With this goal in mind we have weeded out certain families from
previous proposals, where the root was phonologically or semantically too
divergent, or too weakly attested, to be convincing. But we believe we have
also uncovered some additional etymological connections that had previously
gone unnoticed. To a very great extent the recognition of these similarities
has been made possible by the lower-level classificatory work of Greenberg