14 Global Etymologies

John D. Bengtson and Merritt Ruhlen

If the strength of Indo-European studies is largely based on the existence, in a few instances at least, of very old sources, the strength of Amerindian studies is simply the vast number of languages. Thus synchronic breadth becomes the source of diachronic depth.

—Joseph H. Greenberg (1987)

How does one know that two languages are related? Or that two language families are related? Every linguist purports to know the answers to these questions, but the answers vary surprisingly from one linguist to another. And the divergence of views concerning what is actually known is even greater than that exhibited on the question of how one arrives at this body of information. This is not a particularly satisfactory state of affairs. In what follows we will explore these questions in a global context. We conclude that, despite the generally antipathetic or agnostic stance of most linguists, the case for monogenesis of extant (and attested extinct) languages is quite strong. We will present evidence that we feel can only be explained genetically (i.e. as the result of common origin), but we will also attempt to answer some of the criticism that has been leveled at work such as ours for over a century.

THE BASIS OF LINGUISTIC TAXONOMY

That ordinary words form the basis of linguistic taxonomy is a direct consequence of the fundamental property of human language, the arbitrary relationship between sound and meaning. Since all sequences of sounds are equally well suited to represent any meaning, there is no tendency or predisposition for certain sounds or sound sequences to be associated with certain meanings (leaving aside onomatopoeia, which in any event is irrelevant for classification). In classifying languages genetically we seek, among the available lexical and grammatical formatives, similarities that involve both sound and meaning. Typological similarities, involving sound alone or meaning alone, do not yield reliable results.

The fundamental principles of taxonomy are not specific to linguistics, but are, rather, as applicable in fields as disparate as molecular biology, botany, ethnology, and astronomy. When one identifies similarities among molecular structures, plants, human societies, or stars, the origin of such similarities can be explained only by one of three mechanisms: (1) common origin, (2) borrowing, or (3) convergence. To demonstrate that two languages (or language families) are related, it is thus sufficient to show that their shared similarities are not the result of either borrowing or convergence. As regards convergence—the manifestation of motivated or accidental resemblances—linguists are in a more favorable situation than are biologists. In biology, convergence may be accidental, but is more often motivated by the environment; it is not by accident that bats resemble birds, or that dolphins resemble fish. In linguistics, by contrast, where the sound/meaning association is arbitrary, convergence is always accidental.

It is seldom emphasized that similarities between language families are themselves susceptible to the same three explanations. That we so seldom see mention of this corollary principle is largely because twentieth-century historical linguistics has been laboring under the delusion that language families like Indo-European share no cognates with other families, thus offering nothing to compare. At this level, it is alleged, similarities simply do not exist.

What is striking is that this position—for which considerable evidence to the contrary existed already at the start of this century (Trombetti 1905) and which on a priori grounds seems most unlikely (Ruhlen 1988a)—came to be almost universally accepted by linguists, most of whom have never investigated the question themselves. Those few scholars who have actually investigated the question, such as Trombetti (1905), Swadesh (1960), and Greenberg (1987), have tended to favor monogenesis of extant languages. Even Edward Sapir, often considered an exemplar of linguistic sobriety (despite his alleged excesses in the Americas), looked favorably upon the work of Trombetti, as seen in a letter to Kroeber in 1924: “There is much excellent material and good sense in Trombetti in spite of his being a frenzied monogenist. I am not so sure that his standpoint is less sound than the usual ‘conservative’ one” (quoted in Golla 1984: 420). We maintain that a comparison of the world’s language families without preconception reveals numerous widespread elements that can only be reasonably explained as the result of common origin.

BORROWING

Linguists employ a number of well-known techniques to distinguish borrowed words from inherited items. Most important, clearly, is the fact that basic vocabulary, as defined by Dolgopolsky (1964) and others, is highly resistant to borrowing. Though it is no doubt true that any word may on occasion be borrowed by one language from another, it is equally true that such basic items as pronouns and body parts are rarely borrowed. Furthermore, borrowing takes place between two languages, at a particular time and place, not between language families, across broad expanses of time and place. Thus to attribute the global similarities we document here to borrowing would be ludicrous. And as regards the alleged cases of mass borrowing in the Americas (the so-called “Pan-Americanisms”), Greenberg (1990: 11) quite rightly protests “that basic words and pronouns could be borrowed from Tierra del Fuego to British Columbia . . . is so utterly improbable that it hardly needs discussion.” It seems to us even less likely that basic vocabulary—the grist for most of the etymologies we offer herein—could have been borrowed from one language to another all the way from Africa across Eurasia to South America.

CONVERGENCE

A common criticism of work like ours is that, with around 5,000 languages to choose from, it cannot be too hard to find a word in some African language that is semantically and phonologically similar to, or even identical with, some word in an American Indian language. There are so many possibilities, runs this argument, that one can hardly fail to find accidental “look-alikes” everywhere (Goddard 1979, Campbell 1988). But this sort of mindless search is exactly the reverse of how the comparative method proceeds. The units we are comparing are language families, not individual languages (a language isolate like Basque has traditionally been considered, taxonomically, a family consisting of a single language). Specifically, we will be comparing items in the following 32 taxa, each of which we believe is a genetically valid group at some level of the classification: Khoisan, Niger-Congo, Kordofanian, Nilo-Saharan, Afro-Asiatic, Kartvelian, Indo-European, Uralic, Dravidian, Turkic, Mongolian, Tungus, Korean, Japanese-Ryukyuan, Ainu, Gilyak, Chukchi-Kamchatkan, Eskimo-Aleut, Caucasian, Basque, Burushaski, Yeniseian, Sino-Tibetan, Na-Dene, Indo-Pacific, Australian, Nahali, Austroasiatic, Miao-Yao, Daic (= Kadai), Austronesian, and Amerind.

(For a more fundamental discussion of convergence, see Chapter 2.)

One may legitimately wonder why, for the most part, we are comparing relatively low-level families like Indo-European and Sino-Tibetan rather than higher-level taxa like Eurasiatic/Nostratic and Dene-Caucasian, especially since both of us support the validity of these higher-level families (Bengtson 1991a,b, Ruhlen 1990a). We do this to emphasize that higher-level groupings do not require the prior working out of all the intermediate nodes, contrary to the opinion of most Amerindian specialists (the field is all but bereft of generalists!). As is well known, both Indo-European and Austronesian were recognized as families from the early years of their investigation, long before specialists had reconstructed all their intermediate levels (a task that is, of course, still incomplete). In taxonomy it is a commonplace that higher-level groupings are often more obvious—and easier to demonstrate—than are lower-level nodes. We maintain that this is particularly so when one considers the entire world. Current contrary opinion notwithstanding, it is really fairly simple to show that all the world’s language families are related, as we shall see in the etymologies that follow. Discovering the correct intermediate groupings of the tree—the subgrouping of the entire human family—is a much more difficult task, and one that has only begun. Exactly the same is true of Amerind, which itself is a well-defined taxon (Greenberg 1987, Ruhlen 1991a); the subgrouping within Amerind involves far more difficult analyses and taxonomic decisions (Ruhlen 1991c).

Each of our 32 genetic groups is defined by a set of etymologies that connects grammatical and lexical items presumed to be cognate within that group; the postulated membership and putative subgrouping within each of these groups are given in Ruhlen (1987a). The precise number of etymologies defining each of the 32 groups ranges from several thousand (for close-knit and/or well-documented groups like Dravidian or Indo-European) to several dozen (for ancient and/or poorly studied groups like Indo-Pacific or Australian). For the most part the many etymologies defining each group have been discovered independently, by different scholars. (In this regard Greenberg’s work—in Africa, New Guinea, and the Americas—represents an exception to the rule.) So instead of drawing our etymologies from thousands of languages, each containing thousands of words, we are, rather, limited to fewer than three dozen families, some of which have no more than a few hundred identifiable cognates. The pool of possibilities is thus greatly reduced, and accidental look-alikes will be few.

We believe that the failure of our critics to appreciate the truly minuscule probability of accidental similarities is the chief impediment to their understanding why all the world’s languages must derive from a common origin. Accordingly, let us consider this question in some detail. Each of the etymologies we cite involves at least a half-dozen of the 32 supposedly independent families, precisely because the probability of finding the same accidental resemblance in six different families is close to zero. The multiplication of the (im)probabilities of accidental resemblance, as more and more families are considered, quickly assures the attentive taxonomist that similarities shared by numerous families, often separated by vast distances, cannot be due to chance. This crucial point has been emphasized by Collinder (1949), Greenberg (1957, 1963, 1987), and Dolgopolsky (1964), among others, but even Trombetti (1905) was well aware of the statistical importance of attestation in multiple families, rather than in just two. The biologist Richard Dawkins (1987: 274) makes the same point: “Convergent evolution is really a special kind of coincidence. The thing about coincidences is that, even if they happen once, they are far less likely to happen twice. And even less likely to happen three times. By taking more and more separate protein molecules, we can all but eliminate coincidence.”

To see just how unlikely accidental look-alikes really are, let us consider two languages that each have just seven consonants and three vowels:

    Consonants: p, t, k, s, m, n, l
    Vowels: i, a, u

With a few notable exceptions the vast majority of the world’s languages show at least these phonological distinctions. Yet even this minimal inventory is capable of producing 147 CVC roots, as shown in Table 5. The probability of accidental phonological identity is only 1/147, though the probability of accidental phonological resemblance might be 2/147, 3/147, etc., depending on how many other phonological shapes in Table 5 are deemed sufficiently similar. A perusal of Table 5 suggests, however, that most of these putative roots are quite distinct phonologically and are not readily connected by common phonological processes.
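The count of 147 can be checked by brute-force enumeration. The following is a minimal Python sketch, not part of the original text: the names `cvc_roots` and `p_resemblance` are hypothetical, the capital-letter phoneme symbols simply mirror Table 5, and the parameter `k` (how many root shapes count as "similar") is an assumption the text deliberately leaves open.

```python
from fractions import Fraction
from itertools import product

# Minimal inventory assumed in the text: seven consonants, three vowels.
CONSONANTS = ["K", "L", "M", "N", "P", "S", "T"]
VOWELS = ["A", "I", "U"]


def cvc_roots(consonants, vowels):
    """Return every consonant-vowel-consonant string the inventory allows."""
    return ["".join(cvc) for cvc in product(consonants, vowels, consonants)]


roots = cvc_roots(CONSONANTS, VOWELS)
print(len(roots))  # 7 * 3 * 7 = 147

# Chance that a second language has exactly the same root for a given meaning,
# assuming all shapes are equally likely.
p_identity = Fraction(1, len(roots))


def p_resemblance(k, n_roots=len(roots)):
    """Chance of a match if k of the n_roots shapes are accepted as similar.

    k is a hypothetical parameter; the text does not fix its value.
    """
    return Fraction(k, n_roots)


print(p_identity, p_resemblance(3))  # 1/147 and 3/147 (printed reduced, as 1/49)
```

Using exact fractions rather than floats keeps the small probabilities free of rounding error, which matters once they are multiplied across many families.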

TABLE 5 Possible CVC Roots for a Language with Seven Consonants and Three Vowels

KAK  LAK  MAK  NAK  PAK  SAK  TAK
KAL  LAL  MAL  NAL  PAL  SAL  TAL
KAM  LAM  MAM  NAM  PAM  SAM  TAM
KAN  LAN  MAN  NAN  PAN  SAN  TAN
KAP  LAP  MAP  NAP  PAP  SAP  TAP
KAS  LAS  MAS  NAS  PAS  SAS  TAS
KAT  LAT  MAT  NAT  PAT  SAT  TAT
KIK  LIK  MIK  NIK  PIK  SIK  TIK
KIL  LIL  MIL  NIL  PIL  SIL  TIL
KIM  LIM  MIM  NIM  PIM  SIM  TIM
KIN  LIN  MIN  NIN  PIN  SIN  TIN
KIP  LIP  MIP  NIP  PIP  SIP  TIP
KIS  LIS  MIS  NIS  PIS  SIS  TIS
KIT  LIT  MIT  NIT  PIT  SIT  TIT
KUK  LUK  MUK  NUK  PUK  SUK  TUK
KUL  LUL  MUL  NUL  PUL  SUL  TUL
KUM  LUM  MUM  NUM  PUM  SUM  TUM
KUN  LUN  MUN  NUN  PUN  SUN  TUN
KUP  LUP  MUP  NUP  PUP  SUP  TUP
KUS  LUS  MUS  NUS  PUS  SUS  TUS
KUT  LUT  MUT  NUT  PUT  SUT  TUT

Now were we to compare two languages with a more typical phonemic inventory, say, fourteen consonants and five vowels,

    [inventory chart not recoverable; surviving symbols: t, k, č]

we would find that the number of possible CVC roots in each language jumps to 980. Again, of course, the probability of chance resemblance will depend on certain phonological assumptions, but precious few accidental identities or resemblances, vis-à-vis the stock of some other language or group of languages, could be expected.
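Both root counts follow from the same product. As a worked check, write $C$ for the number of consonants and $V$ for the number of vowels (symbols introduced here for illustration only), and assume that any consonant may fill either consonant slot of a CVC root:

```latex
N_{\mathrm{CVC}} = C \cdot V \cdot C = C^{2}V, \qquad
7^{2}\cdot 3 = 147, \qquad
14^{2}\cdot 5 = 980 .
```

Doubling the consonant inventory thus quadruples the consonantal contribution, which is why the count rises so steeply between the two inventories.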

One may appreciate just how unlikely an explanation of chance resemblance—independent development in each family—really is by considering the probability that the resemblances noted in etymology 21 (below) arose by convergence. We have chosen this etymology for our argument because the meaning involved is rarely borrowed and has no onomatopoeic connections. It thus offers a clear case, where the similarities must be due either to common origin or to accidental convergence. Let us try to calculate the probability that these similarities arose independently. To do this we must make certain assumptions, and at each such stage we shall adopt a minimalist approach that in fact overestimates the true probability of a chance match. Let us assume, as we did above, that each language family uses only seven consonants and three vowels, yielding the 147 syllable types shown in Table 5. What, then, is the probability that two languages will accidentally match for a particular semantic/phonological domain, in the present case ‘female genitalia’? Clearly it is 1/147, or about .007. Whatever the form that appears in the first language family, the second family has only one chance in 147 of matching it. And the probability that a third family will offer a match will be $(1/147)^{2}$, or about .000046; that of a fourth family, $(1/147)^{3}$, or about .0000003; and so forth. In the etymology we give, 14 of the 32 taxa show apparent cognates, though the evidence is for the moment slim in Australian and the vowel in Austronesian (and many Amerind forms) is e rather than the expected u. But if we ignore these details, then the probability that the particular sound/meaning correlation “PUT/female genitals” arose independently fourteen times will be $(1/147)^{13}$, or about one chance in ten octillion, by our rough calculations. We feel that this qualifies as a long shot; certainly descent from a common source is the more likely explanation.
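The arithmetic of this argument can be reproduced in a few lines. The Python sketch below is illustrative only: the function name `p_chance_agreement` is hypothetical, and it adopts the same simplifying assumptions as the passage, namely that the families are independent and that all 147 root shapes are equally likely.

```python
from fractions import Fraction

N_ROOTS = 147  # CVC shapes available under the minimal inventory of Table 5


def p_chance_agreement(n_families, n_roots=N_ROOTS):
    """Probability that n_families independent families share one root shape.

    The first family may use any shape; each of the remaining
    n_families - 1 families must then match it, each with probability
    1/n_roots. Independence and equiprobable shapes are assumptions.
    """
    if n_families < 1:
        raise ValueError("need at least one family")
    return Fraction(1, n_roots) ** (n_families - 1)


for n in (2, 3, 4, 14):
    p = p_chance_agreement(n)
    print(f"{n:2d} families: 1 chance in {float(1 / p):.3e}")
```

For fourteen families the denominator is 147 raised to the thirteenth power, which is on the order of 10^28, in line with the rough figure of ten octillion.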

The foregoing constitutes what we consider to be the basis of genetic classification in linguistics. The application of these basic principles to the world’s language families leads inevitably, in our opinion, to the conclusion that they all derive from a single source, as suggested by the 27 etymologies presented below. We have not yet dealt, however, with a number of other topics that in the minds of many linguists are inextricably tied up with taxonomy, questions like reconstruction, sound correspondences, and the like. We believe that these topics are not in fact of crucial importance in linguistic taxonomy, and that mixing the basic taxonomic principles with these other factors has led to much of the current confusion that we see concerning the classification of the world’s languages. So that these ancillary topics not be invoked yet again by those opposed to global comparisons, we will take them up one by one and explain why they are not relevant to our enterprise. Let us begin with a topic that is at the heart of many current disputes, the alleged incompatibility between Greenberg’s method of multilateral comparison and the traditional methods of comparative linguistics.

MULTILATERAL COMPARISON VS. THE TRADITIONAL METHOD

Many linguists feel that Greenberg’s use of what he calls multilateral comparison to classify languages in various parts of the world is incompatible with—or even antagonistic to—the methods of traditional historical linguistics, which emphasize reconstruction and sound correspondences (about which, see below). Thus, Bynon (1977: 271) claims that “the use of basic vocabulary comparison not simply as a preliminary to reconstruction but as a substitute for it is more controversial. . . . Traditional historical linguists . . . have not been slow in pointing out the inaccuracies which are bound to result from a reliance on mere similarity of form assessed intuitively and unsubstantiated by reconstruction.” In a similar vein, Anna Morpurgo Davies (1989: 167) objects that “we do not yet know whether superfamilies outlined in this way have the same properties as families established with the standard comparative method. If they do not, there is a serious risk that the whole concept of superfamily is vacuous.” And Derbyshire and Pullum (1991: 13) find Greenberg’s Amerind hypothesis “startling, to say the least, when judged in terms of the standard methodology . . . .”

The confusion displayed in the previous three quotes (and one could give many others) results from a failure to realize that the comparative method consists essentially of two stages. The first stage is classification, which is really no different from what Greenberg calls multilateral comparison. The second stage, which might be called historical linguistics, involves family-internal questions such as sound correspondences and reconstruction. In practice, there is no name for this second stage simply because the two stages are seldom distinguished in the basic handbooks on historical linguistics, in which, almost without exception, the initial stage, classification, is overlooked (Bynon 1977, Hock 1986, Anttila 1989). Also overlooked in these basic texts are language families other than Indo-European. The origin of this anomaly—which knows no parallel in the biological world—is a consequence of the primogeniture of Indo-European in the pantheon of identified families, and the subsequent elaboration of the family by Europeans in the nineteenth century.

That the initial stage of comparative linguistics, classification, is so systematically overlooked today lies in the origin of the Indo-European concept itself. When Sir William Jones announced in 1786 that Sanskrit, Greek, and Latin—and probably Gothic and Celtic as well—had all “sprung from some common source,” he essentially resolved the first stage of comparative linguistics at the outset: he identified five branches of Indo-European and hypothesized that all five were altered later forms of a single language that no longer existed. What was left unstated in Jones’s historic formulation was the fact that languages such as Arabic, Hebrew, and Turkish—languages that Jones knew well—were
