particle verbs or other separable verbs, where the two elements do not form a
syntactic unit. In a broader typological perspective, the boundary between in-
flectional realization and the derivation of a distinct lexical item is not always
clear in languages that form nouns by joining a stem to a classifier or to a
noun class prefix, nor is it always self-evident whether what counts as a single
lexical item is the complex or the stem alone. Similar considerations apply to
so-called ideophones, free morphemes that in languages like Korean or
Japanese modulate a verb’s lexical meaning (see Tsujimura 2014). Finally,
even superficially unremarkable complexes like adjective + noun can in fact
be sharply distinct for semantic purposes, between
“regular” modification
structures (like strong tea) and cases where the modifier has syntactically the
same status but in fact identifies a distinct entity (like green tea, which does
not denote the same substance as tea but rather a sub-kind). In all these
Models of lexical meaning
367
cases, the
“lexical items” whose content is available in the mental lexicon are
not revealed by simple inspection, but must be identified on the basis of em-
pirically motivated theoretical choices; see Svenonius (2008) for an example
that distinguishes various types of classifiers and modifiers on a structural
basis (for much broader typological studies of classifier structures, see Senft
2000 and Aikhenvald 2002).
As can be seen, the questions that arise when asking more precisely what
linguistic entity correlates with a semantically identified “lexical item” are typically brought into focus by examining how morphology relates to semantics.
This should not surprise us, as morphology is centrally concerned with discrim-
inating on principled grounds between operations on lexical items and opera-
tions that create new ones. A morphosemantic perspective is therefore central
to the study of the mental lexicon.
The need to make explicit what exactly counts as a
“lexical item” for seman-
tics, and to do so in relation to morphology and syntax, leads therefore to a num-
ber of questions, which require precise theoretical choices. In fact, this brings
out the problematic nature of the notion of lexical item itself – clearly a major
issue for the study of the mental lexicon. Positions vary greatly on this funda-
mental point, and this is not the place to attempt a review of them. Still, it is im-
portant to note that the opposition between
“lexicalist” and “non-lexicalist”
theories is about the lexicon as part of linguistic competence, not about the exis-
tence of (something like) the mental lexicon. If only lexicalist approaches envis-
age a lexicon as a distinct linguistic component, this does not mean that non-
lexicalist approaches (like Borer 2005a, b, 2013; or Harley 2012) give up on a se-
mantic notion of lexical item. On the contrary, they explicitly assert the existence
of such semantic listemes, but not as part of the linguistic knowledge that deter-
mines what words are and can be. This is different from claiming, for instance,
that the semantic side of a lexical item is an emergent notion, resulting from a
stable network of associations, and ultimately reducible to a set of uses. Words,
however defined or
“distributed”, have a semantic content which is not just an
epiphenomenon. This content either determines (for lexicalists) or is correlated
to (for non-lexicalists) a cluster of linguistic properties. From the former camp,
Levin (2011) makes this point explicit, as she distinguishes the mass of informa-
tion (stated or implied) associated with the use of a verb in context from the
semantic properties that are necessarily present across all uses of a verb, regard-
less of context; these alone constitute the verb’s lexicalized meaning. A non-
lexicalist perspective likewise recognizes this cluster of linguistic properties,
but analyzes them in the same way as it analyzes non-listed linguistic objects
like phrases and sentences, viewing
“lexicality” as a matter of association
with knowledge of a different kind: about listed forms, about morphological
368
Paolo Acquaviva et al.
properties, and, crucially for the present purposes, about a conceptual con-
tent (this is obviously a simplified generalization; Fábregas and Scalise 2012
offer a more detailed overview, especially on pp. 4–6; and Borer 2013 is the
most developed exposition of a non-lexicalist approach, with a detailed ac-
count of the relation between grammar and encyclopaedic content).
2.2.3 Variation in the empirical domain
For all approaches, the goal is to systematize as precisely as possible the context-
invariant information associated with lexical items (revolving around argument
structure and event structure for verbs, and countability and individuation for
nouns), and to do so in a way that can predict significant generalizations across
typologically different languages. The empirical domain of lexical semantic phe-
nomena to explain is vast, including for instance the role of verb Aktionsart on
deverbal nominalizations (see Alexiadou and Rathert 2010), restrictions on caus-
ative readings and on denominal verbalizations (like the impossibility of a reading “to make laugh” in *the clown laughed the children, and the fact that “they put salt in the box” can be expressed as they boxed the salt but not as *they salted
the box; Hale and Keyser 2002), crosslinguistically stable differences between the
morphological complexity of adjectives expressing basic states like loose and
event-derived states like broken (Koontz-Garboden 2005), the fact that simple
verbs can express the manner of an event, like swim, or its result, like clean, but
not both (Rappaport Hovav and Levin 2010). A central place in this domain of
explananda is occupied by so-called
“lexicalization patterns” (the term from
Talmy 1985), typologically representative alternations in the way languages en-
capsulate information lexically.
Typology and the crosslinguistic dimension are a key aspect of this type of
investigation, and in this connection the contributions by Gennaro Chierchia
(Chierchia 1998, 2010) stand out. They propose a broad-ranging semantic parame-
trization of the interpretation of nouns across natural languages, as fundamen-
tally denoting kind-level entities or predicates. The analysis is couched in
rigorous formal semantic terms, but at the same time it has direct consequences – and predictions – for morphology and syntax, correlating with important typolog-
ical properties such as the obligatoriness of classifiers or the presence of an inflec-
tional plural.
The debate inspired by these contributions has promoted a significant
advance in comparative lexical semantics (Chung 2000, Longobardi 2001,
Wilhelm 2008, to name only a few); in turn this has fruitfully interacted with syn-
tactic and morphological approaches (especially Borer 2005a, b, and much work
inspired by it) to provide a similar impulse on comparative research on count-
ability and individuation (see Massam 2012 and literature cited there). This is
clearly a strand of research that has a particular relevance for the study of the
mental lexicon, as it addresses on empirical bases the perennial question of the
tension between a presumably universal cognitive apparatus and the very di-
verse linguistic encapsulations of meaning.
2.2.4 Lexical knowledge and concepts
The study of the mental lexicon is where the theme of universality and crosslin-
guistic variation in lexical semantics intersects the question of semantics and
conceptual content. Most proposals about the decomposition of lexical items
have generally identified semantic content with conceptual content; the ex-
change between Fodor and Lepore (1999) and Hale and Keyser (1999) illustrates
some of the arguments, limited to one particular syntactic approach. However,
it is far from obvious that the structures posited by lexical decomposition
accounts (which are hypothesized as linguistic objects) should directly reflect
conceptual structure. A brief review will give an idea of the various positions
defended in the literature.
Some theorists have explicitly equated semantic and conceptual knowl-
edge; for instance Jackendoff (1990, 2002) analyzed the building blocks of lexi-
cal semantics as elements of a conceptual representation, so that primitives like
GO or TO are conceptual in nature and not strictly language-internal (even
though they are invoked to account for the linguistic properties of words). On
the other hand, the
“Two-Level Model” of Bierwisch and Schreuder (1992) (see
also Kaufmann 1995 and Wunderlich 1997) distinguishes two distinct levels, a
conceptual one and a semantic one from which grammatically relevant aspects
of meaning are calculated. As shown in the useful critical discussion of Dölling
and Heyde-Zybatow (2007), a distinction between grammatically relevant information, which is structurally represented, and
“pure” conceptual content
without grammatical relevance, is quite common, both in lexicalist accounts
(Rappaport Hovav and Levin 1998) and in non-lexicalist ones (Goldberg 1995;
Borer 2005a,b, 2013; Ramchand 2008). It is certainly understandable that lin-
guistic semantics should focus predominantly on the former dimension; how-
ever, this has arguably limited the contribution of lexical semantics to the
study of the mental lexicon. Consider the simple observation that languages dif-
fer in the way they cut up a range of perceptual experiences: Borer (2005a: 12)
notes that in English bees
“sting” but mosquitoes “bite”, like dogs and snakes;
by contrast, in Hebrew the attacks of bees and mosquitoes are described by the
same verb (‘aqac), while those of dogs and snakes are described by two distinct verbs (našax and hikiš respectively). Surely, the different ranges of
applicability point to different boundaries in the
“conceptual content” of these
terms. But in Borer’s words, “it would be unfortunate to conclude from this that Hebrew speakers live in a different conceptual (or, for that matter, physical) world from that occupied by English speakers.” (Borer 2005a: 12). If, say, BITE1 and BITE2 are distinct but commensurable (Borer suggests “bundles of features, plausibly hierarchically arranged”), then their conceptual content must be elucidated in a way that accounts for this (presumed) overlap, and makes clear what
empirical evidence can be brought to bear on the matter. Crucially, this would go
beyond a lexical semantic analysis. Just as crucially, though, it would relate
semantics to the psychological investigation of concepts; and this is needed to
avoid the unenlightening situation where a
“lexical concept” is defined as the
conceptual content of a lexical item, and a lexical item, circularly, as the linguis-
tic encapsulation of a concept (see Acquaviva and Panagiotidis 2012 for a critical
discussion).
The question of how lexical semantic explanation can be related to psycho-
logically plausible models of mental representation has indeed acquired a cer-
tain degree of urgency, as shown in the important contributions of Riemer
(2013, 2016); especially so since many psychological accounts of the representa-
tion of verbal meaning no longer support the classic notion of modality-
independent, discrete, stable
“concepts”. In order to contribute to a theory of
the mental lexicon, therefore, lexical semantics can no longer rely on some as-
sumed psychological notion of
“conceptual content”, but should itself strive to
validate its results in ways that are psychologically plausible.
An interesting development in this connection is represented by those inves-
tigations that seek to shed light on the psychological representation of polysemy.
Several studies (see Brown 2008 for a critical review, as well as Murphy 2007,
both cited by Rainer 2014) have attempted to establish on experimental grounds
whether the distinct senses that can be activated by a single form like paper (sub-
stance or daily publication) are stored, accessed, and represented as subentries
of a larger item, or rather as independent entries, as distinct from each other as
homonyms. Apart from their intrinsic importance as contributions to the under-
standing of the mental lexicon, such studies can be particularly useful in bridg-
ing the gap between the use of
“linguistic” analysis (using language-internal
evidence) and the use of psychological and neurological evidence; see in particu-
lar Pylkkänen, Llinás and Murphy (2006) in this connection.
2.3 The cognitive perspective
In this section we give a presentation of the foundational ideas of Cognitive
Linguistics and relate them to the views in Generativist and Structuralist
approaches. The section starts with the basic assumptions and proceeds to take
a closer look at some core lexical semantic concepts in the literature, and how
they are treated within this framework. As we have seen in Sections 2.1 and 2.2,
the assumptions differ across theoretical accounts. Cognitive Linguistics takes
a pragmatically enriched view of meaning modeling where natural language
use is of key importance (Croft and Cruse 2004, Paradis 2005, Fillmore 2006,
Goldberg 2006, Geeraerts 2010, Gärdenfors 2014). Lexical items do not have sta-
ble meanings, rather they evoke meanings when they are used in discourse.
Discursive meanings of lexical items are viewed as construals of specific mean-
ings in specific contexts (Paradis 2015). Meaning creation in context is both dy-
namic and constrained by encyclopaedic factors and conventionalization
patterns. The way people use and understand language is related to the world
around us. Language is dependent on our sensory and cognitive system, on the
one hand, and on our role as members of different cultures on the other. The
way we experience the world is decisive for how we understand it and how we
portray it in human communication. The focus of interest is different from the
symbolic approach in that researchers in this field take an interest in how lan-
guage is used in all its richness and in different contexts (for a comparison
between the generative and the cognitive commitments, see also Paradis 2003).
Language and concept formation have socio-psychological grounding. Category
membership is primarily a matter of more or less, rather than either-or, which
is an idea launched by Wittgenstein (1968). His notion of family resemblance
and gradience for membership of the category of game influenced prototype
theorists’ work (Rosch 1973, 1975), sociolinguists such as Labov, and subsequently the Cognitivist movement (Taylor 2003; for a discussion of gradience and categoriality, see Aarts 2004).
According to the Cognitivist approach, meaning in language is encyclopae-
dic in the sense that there is no specific point along a linguistic-encyclopaedic
continuum where we can say that linguistic knowledge ends and encyclopaedic
knowledge starts. This does not mean that all aspects of meaning are consid-
ered to be of exactly the same type (Langacker 1987: 158–161, Paradis 2003,
Croft and Cruse 2004). The major dividing line between the two is rather
whether it is at all possible to distinguish between linguistic knowledge and en-
cyclopaedic knowledge. The reason for this difference between the approaches
hinges on the stand for or against language as a separate module in the brain.
To illustrate the problems with the exclusion of encyclopedic lexical knowledge
for the description and motivations of meaning variability of lexical items,
Paradis (2003) gives examples of words in the English language, arguing that
knowing the meaning of open, fast and newspaper in different contexts always
involves knowing about the kinds of activities, properties and things involved.
In order to understand the meaning of open we need to know what kind of ac-
tivities we perform when we open things such as boxes, debates, pubs, com-
puters or books. Similarly, we need to know what entities can be fast and in
what way or whether newspaper refers to an artefact, a company or people
working there.
Language is considered to be shaped by the two main functions it serves:
the semiological function and the interactive function (Langacker 1998: 1). The
semiological function is the mapping of meanings (conceptualizations) to
linguistic forms in speech and writing. These structures are often referred to
as form-meaning pairings or constructions (Fillmore and Kay 1995; Goldberg
1995). The interactive function, on the other hand, concerns the communica-
tive side of language use as a social phenomenon including aspects such as
the function of providing information as well as expressing the speaker’s subjective stance and intersubjective awareness (Verhagen 2005, Gärdenfors
2014; Paradis 2015). Both the semiological and the interactive functions are
important for the guiding idea that language use must be explained with ref-
erence to the underlying mental processes as well as with reference to the so-
cial and situational context. At the core of the Cognitive approach is the
meaningful functioning of language in all its guises and all its uses in text
and discourse. It is a usage-based framework with two different applications,
one ontological and one methodological, both of which are central to the
framework. In the first application of the term usage-based, meanings of
words are acquired, develop and change through their use in social commu-
nication (Traugott and Dasher 2001, Tomasello 2003, 2008, Paradis 2008,
2011). The other application of the term usage-based refers to the fact that
naturally occurring text-based data are important as behavioral data sources
to gain insight into the nature of meaning in
“real” language use (Gonzalez-
Marquez et al. 2007).
The Cognitive approach to meaning contrasts not only with formal approaches, but also with the Structuralist approach, which sees language as an
autonomous intralinguistic system of relations between lexical items, orga-
nized on the basis of lexical fields (Lehrer 1974, Cruse 1986). According to that
view, meanings of lexical items are not substantial, but relational and defined
in terms of what they are not. For instance, weak gets its meaning from its re-
lation to strong. Strong means what it does because it does not mean
“weak”.
Paradigmatic relations like these hold between lexical items which can
felicitously fill the same slot in an expression or a sentence (Lyons 1977). The
same applies to synonyms such as mother and mum in my mother is tall; my
mum is tall, or hyponyms such as horse and animal in the horse is in the
stables; the animal is in the stables. This paradigmatic approach to meaning
does not make much sense in the Cognitive framework, as we will see below.
There was, however, another line of research within Structuralism, whose scholars instead stressed the importance of the syntagm for lexical meaning, i.e. the linear relations formed between lexical items in a sentence (Cruse 1986: 16). Through these syntagmatic structuralist ideas and through the development of machine-readable corpora, collocations and co-occurrence patterns
became important theoretical notions (Firth 1957, Sinclair 1987). The approach to
lexical meaning endorsed by the syntagmatic structuralists assumes that a
lexical item gets its meaning from the totality of its uses or, put differently, a
lexical item gets its meaning from the company it keeps in language use. In
this respect, the syntagmatic approach paved the way for new trends in lin-
guistics, namely for usage-based approaches to lexical semantics where con-
textual factors and real language in use are prime research objectives for the
description of meanings. This includes Cognitive Linguistics approaches and
computational approaches to lexical meaning (Pustejovsky 1995, Jackendoff
2002, Lenci and Sahlgren, to appear).
Following up on the notion of the syntagm within the Cognitive perspective, we point to the contribution of lexical items to the syntagmatic context at the level of sentence or utterance, as well as the contribution of the
syntagmatic contexts to the interpretation of the lexical item. As concrete ex-
amples of topics and their treatments within Cognitive Linguistics, the notions of polysemy, homonymy, synonymy, hyperonymy and hyponymy, and antonymy, and the relations they may form due to contextual factors at the syntagmatic level, are selected for a brief discussion. Like meanings in general, relational
variants are viewed as construals of meanings and may be grouped into three
main types.
– Polysemes are lexical items that have the same form. They evoke different
but related meanings in their syntagmatic strings. Homonyms also share
the same form, but their meanings are not at all related.
– Synonyms have different forms. They evoke meanings that are similar to
some degree but are instantiated in different domain matrices or frames.
Similarly, hyperonyms and hyponyms do not share forms but evoke related
meanings at different levels of generality, i.e. more general or less general.
– Antonyms have different forms. They evoke opposite properties of the
same meaning. Following Jones et al. (2012), the term is used as a cover
term for all different types of opposites in language.
Let us consider a pair of lexical items from the first category, where the items
share form but both differ and share aspects of meaning. Consider (1) from an
interview with Woody Allen.¹

(1) As I’ve said many times, rather than live on in the hearts and minds of my fellow man, I would rather live on in my apartment [emphasis added].

(2) The pen is mightier than the sword.
The two uses of live on in (1) are polysemes. The explanation for our interpre-
tation of the two expressions is that they share aspects of meaning but occur
in two different syntagmatic contexts, and totally different meaning domains
support those contexts. The first use of live on is instantiated in a mental do-
main by in the hearts and minds of my fellow man, while the second use of live
on is couched in a concrete place, namely in my apartment. Polysemous lexi-
cal items such as live on are related by way of comparison, more precisely
through metaphorization. A state in the domain of apartment is compared to a
state in the mental domain. The two states share properties, but are instanti-
ated in different domains (e.g., Lakoff and Johnson 1980, Gibbs 1994, Giora
2003, Hanks and Giora 2011, Paradis 2015).
Pen and sword in (2) do not refer to the objects as such but to what these
objects are used for and to their users. The meanings are metonymically con-
strued through the affordances of the conceptual structure of PEN and SWORD respectively, that is, what they are used for and by whom. That part of the
conceptual structure is made salient through zooming in on the most relevant
aspect. The lexical items can be seen as shortcuts to the relevant areas in con-
ceptual space (Paradis 2004, Panther and Thornburg 2003, Benczes,
Barcelona and Ruiz de Mendoza Ibáñez 2011). If we regard them as construals
of usage, we are able to explain classical philosophical problems such as whether a fake gun is a gun, or, as in (2), how pen and sword can both be hyponyms of weapon. In this context, mightier links pen and sword. The interpretation of pen is metonymically related to how the pen is used, and so is the interpretation of sword (Paradis 2004). In this particular syntagm, neither is used to refer to the artefacts per se but to the idea that communication is a more effective tool than violence or military force, and thereby a hyponymic relation is construed.
1 Paris Review, “The Art of Humor No. 1”: http://www.theparisreview.org/interviews/1550/the-art-of-humor-no-1-woody-allen (7 October 2015).
Both types of polysemes are motivated variants in the sense that they
evoke meanings which are related through a construal of comparison and re-
semblance (metaphorization), or through a contingent relation and a part-
whole construal of salience (metonymization) (Croft and Cruse 2004, Paradis
2004, 2005, 2015). In contrast, homonyms such as sole (the bottom part of a
shoe) and soul (the spirit) are arbitrary variants with the same form but with
unrelated meanings. Homonyms just happen to sound and/or look the same
in contemporary speech or writing.
Secondly, synonyms are lexical items that share core aspects of meaning,
but differ with respect to the patterning and ranking of the meaning domains
on the basis of which they are profiled.
(3) They are rich/prosperous/loaded.

(4) The twins are ambidextrous/both-handed.
In (3) rich/prosperous/loaded all refer to wealth, but in slightly different ways and contexts: rich is neutral with respect to usage, while prosperous is formal and loaded is informal. It is well known that there are no absolute
synonyms in language use. There is a gradient of conceptual and communica-
tive similarity (Cruse 2011: 142–145, Divjak 2010, Storjohann 2010). From a
conceptual point of view synonymy can be described as the opposite of poly-
semy. Synonyms share core conceptual structures which are expressed
through different word forms. Metaphorical polysemes and homonyms, on
the other hand, are instantiated in different conceptual domains, under the
constraint of invariant configurations (Lakoff 1990, Paradis 2012, 2015), while
expressed by the same lexical item, and metonymical variants are instanti-
ated in the same domain. The conventional meaning of the lexical item and
the discursive meaning are in a part-whole relationship created through a
construal of salience that zooms in or zooms out (Paradis 2004).
Furthermore, hypernyms and hyponyms are also synonyms in the sense
that they share core meanings but differ with respect to specificity or general-
ity as in (5) and (6). Synonyms are construable as a bi-directional coupling, as
in if you are rich you are also prosperous or loaded, and if somebody is ambi-
dextrous he or she is also both-handed and vice versa. In the case of hyper-
nyms and hyponyms the bi-directionality does not hold. The meaning
construal is unidirectional as seen in (5) and (6).
(5) Mumbling is talking but talking is not necessarily mumbling.

(6) A dagger is a knife but a knife is not necessarily a dagger.
Finally, antonymy is a binary construal of opposition that holds between two
different lexical items in discourse. It is a relation of difference in similarity.
Antonyms always evoke opposite properties of one and the same conceptual di-
mension (Paradis and Willners 2011, Jones et al. 2012). For instance, good and
bad may be used as antonyms along the dimension of MERIT, and good and evil along the dimension of BENEVOLENCE. Interestingly, antonymic lexical items are
in fact used in the same semantic contexts even when they are not used to express opposition (Paradis et al. 2015). Contrary to what one might think at first, this means that antonymy differs from synonymy in that it thrives on similarity, and the members form pairs along one dimension. Given long,
short comes to mind immediately. For this reason, the question “What is the opposite of X?” is easy to answer, while it is hard to find an answer to “What is the synonym of X?”. In contrast to the other relations, antonymy is a truly fun-
damental relation in the sense that it appears to be the most readily appre-
hended by speakers of a language.
Contrast in perception and bipolar organization in cognition are the
underpinnings of antonymy in language. Most speakers have strong intuitions
about how antonyms are used and that some antonyms are perceived to be
better exemplars than others. Research using different observational techni-
ques has established that there are a number of opposable word pairs that
have special status as canonical antonyms (Murphy et al. 2009; Paradis et al.
2009, Paradis and Willners 2011, van de Weijer et al. 2012, van de Weijer et al.
2014). The strength of antonym couplings is determined by factors such as the
degree of conventionalization as form-meaning pairs in discourse, the degree
of entrenchment as antonymous words in memory, and the salience of the di-
mensional domain they express, e.g. LUMINOSITY dark-light, STRENGTH weak-strong, SIZE small-large, WIDTH narrow-wide. It has been argued that it is the
meaning dimension that is the cause of the strength of the lexical relation
rather than the effect of the high frequency of these words in language
(Murphy and Andrew 1993; van de Weijer et al. 2012). The contentful meaning
structures, e.g. LUMINOSITY or STRENGTH, of the dimensions that form the base
of canonical antonyms, coincide with the core of semantic types that are cen-
tral to all human activities, as noted by Dixon (2009).
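The sentence-level co-occurrence of antonym pairs reported in these studies can be illustrated with a minimal computational sketch. The sentences below are invented for the purpose, and the raw count stands in for the corpus statistics and chance baselines used in the actual research:

```python
import re

# Invented mini-text; the cited studies use large corpora and
# statistical baselines rather than raw counts like these.
text = (
    "The days are long in summer and short in winter. "
    "A long talk followed a short break. "
    "The rope was long. "
    "The fence was high and the gate was wide."
)

# Split into sentences and tokenize very crudely.
sentences = [s.lower().split() for s in re.split(r"\.\s*", text) if s]

def cooccurrence(word_a, word_b):
    """Number of sentences in which both members of a pair occur."""
    return sum(1 for s in sentences if word_a in s and word_b in s)

# The antonym pair long/short co-occurs sententially;
# the unrelated pairing long/wide does not.
print(cooccurrence("long", "short"))  # 2
print(cooccurrence("long", "wide"))   # 0
```

On real corpora, canonicity would be assessed by comparing such counts against chance co-occurrence, as in the observational studies cited above.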
2.4 The distributional perspective
Distributional semantics is a rich family of computational models assuming that
the statistical distribution of words in linguistic context plays a key role in char-
acterizing their semantic behavior. The theoretical foundation of distributional
semantics is what has become known as the Distributional Hypothesis: Lexemes
with similar distributional properties have similar meanings. Distributional seman-
tics has been attracting a growing interest especially in the last twenty years, but
its roots are much older. They lie in linguistic and philosophical traditions that,
despite being substantially different, share the common assumption that the
meaning of words must be described by looking at how they are used in language.
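The Distributional Hypothesis can be made concrete with a toy sketch. The six-sentence corpus, the window size, and all lexical choices below are invented for illustration; the point is only that count vectors built from shared contexts make beer come out as more similar to wine than to car:

```python
from collections import Counter
from math import sqrt

# Invented toy corpus (illustrative only).
corpus = [
    "we drank cold beer at the pub",
    "we drank red wine at the dinner",
    "she drove the fast car to work",
    "he drank warm beer at home",
    "she drank white wine at home",
    "he drove the old car to town",
]

def context_vector(target, sentences, window=2):
    """Count the words occurring within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                counts.update(t for t in tokens[lo:hi] if t != target)
    return counts

def cosine(a, b):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

beer, wine, car = (context_vector(w, corpus) for w in ("beer", "wine", "car"))
# Similar distributions, similar meanings: beer is closer to wine than to car.
assert cosine(beer, wine) > cosine(beer, car)
```

Scaled up to real corpora (and refined with weighting schemes and dimensionality reduction), this is essentially the architecture of the distributional models discussed in this section.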
Zellig Harris is usually referred to as the theoretical and methodological
source of research in distributional semantics: “If we consider words or morphemes A and B to be more different in meaning than A and C, then we will often find that the distributions of A and B are more different than the distributions of A and C. In other words, difference of meaning correlates with difference of distribution.” (Harris 1954: 156). In his later works, Harris characterizes
linguistic distributions in terms of syntactic dependencies involving relations
between a word acting as operator and a word acting as its argument. The
“selection” (that is, the distribution) of a word is the set of operators and argu-
ments with which it co-occurs with a statistically significant frequency, and is
strongly correlated to its meaning. According to Harris, meaning “is a concept of no clear definition” (Harris 1991: 321), but distributional analysis can turn it into a measurable, objective and therefore scientific notion: “Selection is objectively investigable and explicitly statable and subdividable in a way that is not possible for meanings – whether as extension and referents or as sense and definition.” (Harris 1991: 329). The goal of Harris’ distributional programme is
therefore not to exclude meaning from the study of language, but rather to pro-
vide a scientific foundation for its investigation.
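Harris’ notion of “selection” lends itself to a simple operationalization. In the sketch below, pointwise mutual information serves as one possible measure of statistically significant co-occurrence (an assumption for illustration, not Harris’ own formulation), and the operator-argument pair counts are invented stand-ins for parsed corpus data:

```python
from collections import Counter
from math import log2

# Invented (operator, argument) dependency pairs with toy frequencies.
pairs = (
    [("drink", "water")] * 8 + [("drink", "beer")] * 6 +
    [("drive", "car")] * 9 + [("drink", "car")] * 1 +
    [("drive", "water")] * 1
)

pair_freq = Counter(pairs)
op_freq = Counter(op for op, _ in pairs)
arg_freq = Counter(arg for _, arg in pairs)
n = len(pairs)

def pmi(op, arg):
    """Pointwise mutual information of an operator-argument pair."""
    return log2((pair_freq[(op, arg)] / n) /
                ((op_freq[op] / n) * (arg_freq[arg] / n)))

def selection(op, threshold=0.0):
    """Arguments co-occurring with `op` more often than chance predicts."""
    return {arg for o, arg in pair_freq if o == op and pmi(op, arg) > threshold}

# drink selects water and beer, but not car; drive selects car.
assert selection("drink") == {"water", "beer"}
assert selection("drive") == {"car"}
```

The selection sets so computed are “objectively investigable and explicitly statable” in Harris’ sense: they are derived entirely from observable co-occurrence counts.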
Distributional semantics is a direct product of American structuralism, but it
is also strongly indebted to European structural linguistics. The (semantic) rela-
tion between two words or morphemes is defined differentially, based on their
distributional behavior. As for De Saussure, words have meaning only within a
linguistic system, in which they are used and entertain various relations with
other expressions. Jakobson (1959) calls the knowledge of such relations “linguistic acquaintance”, whose importance supersedes the role of the “direct acquaintance” with the entities words refer to. The latter may even be lacking (for
instance, we can use ambrosia correctly even without direct experience of its
referent), while linguistic acquaintance is an essential condition to understand
the meaning of any lexeme. Structural semantics proceeded independently from
378
Paolo Acquaviva et al.
distributionalism, but the latter was often adopted as a method to define para-
digms in terms of syntagmatic relations. The Distributional Hypothesis can in-
deed be reformulated in stricter structuralist terms (Sahlgren 2006): Lexemes that
share syntagmatic contexts have similar paradigmatic properties. For instance,
Apresjan (1966) referred to Harris’ distributional methodology as a way to pro-
vide more objectivity to the investigation of semantic fields by grounding it on
linguistic evidence. Apresjan carried out a distributional analysis of adjectives in
terms of their frequency of co-occurrence with various syntactic contexts. The in-
terplay between syntagmatic and paradigmatic dimensions is also central for
Cruse (1986): The greater the paradigmatic “affinity” of lexical items, the more
congruent their patterns of syntagmatic relations.
The idea that distributional analysis is the key to understanding word meaning
has also flourished within the linguistic tradition stemming from John Firth. In
fact, corpus linguistics represents another important root of distributional seman-
tics. Firth’s contextual theory of meaning was based on the assumption that
meaning is a complex and multifaceted reality, inherently related to language use
in contexts (e.g., social setting, discourse, etc.). One of the key “modes” of meaning
of a word is what he calls “meaning by collocation” (Firth 1951), determined
by the context of surrounding words. The study of collocations has kept on grow-
ing as an independent line of research, but its theoretical assumptions and meth-
ods are deeply intertwined with distributional semantics. Finally, another crucial
philosophical reference for distributional semantics is represented by the usage-
based view of meaning developed by Ludwig Wittgenstein in his later writings. In
the Philosophical Investigations, Wittgenstein urges us not to assume a gen-
eral and fixed meaning of words. Instead, we should look at how the words
are being used, because
“the meaning of a word is its use in the language.”
(Wittgenstein 1953).
2.4.1 Distributional semantic models
The Distributional Hypothesis is a general assumption about the relationship be-
tween meaning and linguistic distributions, and states that the semantic simi-
larity of lexical items is a function of their distribution in linguistic contexts.
Distributional Semantic Models are computational methods that turn this hy-
pothesis into a scientific framework for semantic analysis. Distributional
Semantic Models are also commonly referred to as word space models, semantic
space models, (semantic/distributional) vector space models, geometrical (se-
mantic) models, context-theoretic semantic models, statistical semantic models
or corpus-based semantic models. These names emphasize different aspects of
the way Distributional Semantic Models learn and represent the semantic con-
tent of lexical items. Distributional Semantic Models form a vast multifarious
family of computational methods often developed within very different re-
search traditions and for diverse purposes (e.g., information retrieval, natural
language processing, cognitive modeling), but they all share the following prin-
ciples: words are represented as vectors built from their distribution in the con-
texts extracted from corpora, and similarity between words is approximated in
terms of geometric distance between their vectors.
The standard organization of Distributional Semantic Models is usually de-
scribed as a four-step method (Turney and Pantel 2010):
1. for each target word, contexts are first collected from a (usually large) cor-
pus and counted to build a co-occurrence matrix, whose rows correspond to
the target lexemes and whose columns correspond to the contexts;
2. raw frequencies are then transformed into significance scores (e.g., positive
pointwise mutual information) that better reflect the importance of the con-
texts for characterizing the target lexemes;
3. the resulting matrix tends to be very large and sparse, requiring techniques
to reduce the number of dimensions, such as Singular Value Decomposition
or Principal Component Analysis;
4. finally, a similarity score is computed between the row vectors, using vari-
ous vector similarity measures, the most common one being the cosine.
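The four steps can be sketched in Python on a toy corpus. This is a minimal illustration, not a reference implementation: the corpus, the window size of 2, and the 3 reduced dimensions are arbitrary choices made here for the example.

```python
import numpy as np

corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
    "the dog chases the cat".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Step 1: word-by-word co-occurrence matrix with a symmetric window of 2.
M = np.zeros((len(vocab), len(vocab)))
window = 2
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                M[idx[w], idx[sent[j]]] += 1

# Step 2: re-weight raw counts with positive pointwise mutual information.
total = M.sum()
p_w = M.sum(axis=1, keepdims=True) / total
p_c = M.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((M / total) / (p_w * p_c))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Step 3: reduce dimensionality with truncated Singular Value Decomposition.
U, S, _ = np.linalg.svd(ppmi)
k = 3
vectors = U[:, :k] * S[:k]

# Step 4: compare words with the cosine of their row vectors.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors[idx["cat"]], vectors[idx["dog"]]))
```

On a real corpus the matrix would be sparse and vastly larger, and the same cosine comparison would be run over thousands of row vectors.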
Distributional Semantic Models have many design options, due to the variety of
parameters that can be set at each step of the process and that may affect the
results and performance of the system. The definition of context is surely a
crucial parameter in the implementation of the models. Three types of linguistic
environments have been considered: in document-based models, as in Latent
Semantic Analysis (Landauer and Dumais, 1997), words are similar if they ap-
pear in the same documents or in the same paragraphs; word-based models
consider a linear window of collocates around the target words (Lund and
Burgess, 1996; Sahlgren, 2006); syntax-based models are closer to Harris’ ap-
proach as they compare words on the basis of their dependency relations
(Curran, 2003; Padó and Lapata, 2007; Baroni and Lenci, 2010). Word-based
models have an additional parameter represented by the window size (from a
few words to an entire paragraph), while syntax-based models need to specify
the type of dependency relations that are selected as contexts. Some experi-
ments suggest that syntax-based models tend to identify distributional neigh-
bors that are taxonomically related, mainly co-hyponyms, whereas word-based
models are more oriented towards identifying associative relations (Van de
Cruys, 2008; Peirsman et al., 2007; Levy and Goldberg, 2014). However, the
question whether syntactic contexts provide a real advantage over linear mod-
els is still open. On the other hand, a more dramatic difference exists with re-
spect to document-based models, which are strongly oriented towards
neighbors belonging to loosely defined semantic topics or domains (Sahlgren,
2006).
Recently, a new family of Distributional Semantic Models has emerged,
which takes a radically different approach to learning distributional vectors. They are
based on neural network algorithms and are called predict models, because, in-
stead of building a co-occurrence matrix by counting word distributions in
corpora, they directly create low-dimensional distributional representations by
learning to optimally predict the contexts of a target word. These representations
are also commonly referred to as (neural) word embeddings. The most popular
neural Distributional Semantic Model is the one implemented in the word2vec li-
brary (Mikolov et al. 2013).
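The learning objective behind word2vec’s skip-gram model can be illustrated with a toy numpy sketch of skip-gram with negative sampling. This is not the word2vec implementation itself; the corpus, dimensionality, learning rate, and number of negative samples are arbitrary assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
    "the dog chases the cat".split(),
]
vocab = sorted({w for s in corpus for w in s})
idx = {w: i for i, w in enumerate(vocab)}
V, dim = len(vocab), 10

# Two embedding tables: one for target words, one for context words.
W_t = (rng.random((V, dim)) - 0.5) / dim
W_c = (rng.random((V, dim)) - 0.5) / dim

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Collect (target, context) pairs with a symmetric window of 2.
pairs = []
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 2), min(len(sent), i + 3)):
            if i != j:
                pairs.append((idx[w], idx[sent[j]]))

lr, n_neg = 0.05, 3
for epoch in range(200):
    for t, c in pairs:
        negs = rng.integers(0, V, n_neg)
        # Positive update: increase the score of the observed (target, context) pair.
        g = sigmoid(W_t[t] @ W_c[c]) - 1.0
        grad_t = g * W_c[c]
        W_c[c] -= lr * g * W_t[t]
        # Negative updates: decrease the score of randomly sampled contexts.
        for n in negs:
            gn = sigmoid(W_t[t] @ W_c[n])
            grad_t += gn * W_c[n]
            W_c[n] -= lr * gn * W_t[t]
        W_t[t] -= lr * grad_t

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(W_t[idx["cat"]], W_t[idx["dog"]]))
```

The key contrast with the count-based pipeline is that no co-occurrence matrix is ever built: the low-dimensional vectors in `W_t` are adjusted directly so that observed contexts are predicted over sampled alternatives.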
Because of its history and different roots, distributional semantics is a mani-
fold program for semantic analysis, which is pursued in disciplines as different as
computational linguistics and psychology. The goals of Distributional Semantic
Models are equally various: thesaurus construction, word-sense disambiguation,
cognitively plausible models for language acquisition and processing, etc. Within
this broad range of applications, we can distinguish between a weak and a strong
version of the Distributional Hypothesis (Lenci 2008).
The Weak Distributional Hypothesis is essentially a method for semantic
analysis. The starting assumption is that lexical meaning (whatever this might
be) determines the distribution of words in contexts, and the semantic proper-
ties of lexical items act as constraints governing their syntagmatic behavior.
Consequently, by inspecting a relevant number of distributional contexts, we
can identify those aspects of meaning that are shared by lexemes with similar
linguistic distributions. The Weak Distributional Hypothesis assumes the exis-
tence of a correlation between semantic content and linguistic distributions,
and exploits such correlation to investigate the semantic behavior of lexical
items. It does not entail that word distributions are themselves constitutive of
the semantic properties of lexical items at a cognitive level, but rather that
meaning is a kind of
“hidden variable” responsible for the distributions we ob-
serve, which we try to uncover by analyzing such distributions.
The Strong Distributional Hypothesis is instead a cognitive assumption
about the form and origin of semantic representations. Repeated encounters
with lexemes in language use eventually lead to the formation of a distribu-
tional representation as an abstract characterization of the most significant
contexts with which the word co-occurs. Crucially, the Strong Distributional
Hypothesis entails that word distributions in context have a specific causal role
in the formation of the semantic representation for that word. Under this ver-
sion, the distributional behavior of a lexeme is not just regarded as a way to
describe its semantic properties, but rather as a way to explain them at the cog-
nitive level.
The strong and weak versions of the Distributional Hypothesis set very differ-
ent constraints and goals for computational models. Most of the Distributional
Semantic Models in computational linguistics content themselves with
the weak version, and conceive of distributional semantics as a method to endow
natural language processing systems with semantic information automatically
acquired from corpora. On the other hand, Distributional Semantic Models in
cognitive research confront the potentialities as well as the problems raised by
the Strong Distributional Hypothesis, which must therefore face
the tribunal of the cognitive evidence about semantic representations. In any
case, the success of the Distributional Hypothesis, either as a descriptive method
for semantic analysis or as an explanatory model of meaning, must be evaluated
on the grounds of the semantic facts that it is actually able to explain.
2.4.2 Distributional representations as semantic representations
The main characteristics of distributional semantics can be summarized as follows:
– the theoretical foundation of distributional semantics is the Distributional
Hypothesis. This is primarily a conjecture about semantic similarity, which
is modeled as a function of distributional similarity: semantic similarity is
therefore the core notion of distributional semantics;
– the Distributional Hypothesis is primarily a conjecture about lexical mean-
ing, so that the main focus of distributional semantics is on the lexicon,
specifically on content words (i.e., nouns, verbs, adjectives, and adverbs);
– distributional semantics is based on a holistic and relational view of mean-
ing. The content of lexical items is defined in terms of their (dis)similarity
with other lexemes;
– distributional semantics is based on a contextual and usage-based view of
meaning. The content of lexical items is determined by their use in contexts;
– the Distributional Hypothesis is implemented by Distributional Semantic
Models. These are computational methods that learn distributional represen-
tations of lexical items from corpus data. The distributional representation of
a lexeme is a distributional vector recording its statistical distribution in lin-
guistic contexts;
– semantic similarity is measured with distributional vector similarity.
What, then, are the main features of distributional vectors as semantic represen-
tations? How do they differ from other types of representations of lexical mean-
ing? As noted above, distributional semantics is strictly and naturally related to
the structuralist view of meaning. This follows not only from the history itself
of distributional semantics, but also from its relational view of meaning. Like
structuralist approaches, distributional semantics considers the meaning of a
lexical item as dependent on its relations with the other lexemes in the seman-
tic space. A close “family resemblance” also exists with cognitive models, with
which distributional semantics shares a usage-based view of meaning.
By contrast, stronger differences divide distributional semantics from ap-
proaches that adopt semantic representations in terms of symbolic
structures. In symbolic models, lexical items are mapped onto formal structures
of symbols that represent and make explicit their semantic properties. What
varies is the formal metalanguage used to build semantic representations (for
example, networks, frames, semantic features, recursive feature structures, and
so on). Symbolic semantic representations are qualitative, discrete, and categor-
ical. Semantic explanations only refer to the structure of semantic symbols with
which lexical meanings are represented. For instance, in a semantic network
like WordNet (Fellbaum 1998), the hypernym hierarchy of car explains that
John bought a car entails that John bought a vehicle. Semantic similarity is also
defined over the lexical symbolic structures, for instance by measuring the
overlap between feature lists (Tversky 1977) or the distance in semantic net-
works (Budanitsky and Hirst 2006).
The characteristics of distributional semantics also make it quite different from
theories of meaning that are not grounded on the Distributional Hypothesis,
most notably formal (model-theoretic) semantics. Formal semantics is itself a
rich and variegated family of semantic models that share a referential (denota-
tional) view of meaning, based on the assumption that meaning is essentially a
relation between the symbols of languages and entities external to language,
and that the goal of semantics is to characterize the truth conditions of senten-
ces as a function of the reference (denotation) of their parts. In fact, the core
notions of Frege’s programme for formal semantics – truth, reference, and
logical form – are as different as possible from those of Harris’ programme for
distributional semantics – linguistic contexts, use, and distributional vectors. The
distance between these two semantic paradigms can be best appreciated by
considering the contrast between their main philosophical references: the early
Wittgenstein of the Tractatus Logico-Philosophicus (Wittgenstein 1922) for for-
mal semantics, and the later Wittgenstein of the Philosophical Investigations for
distributional semantics. Therefore, it is no surprise that formal and distribu-
tional semantics, as the heirs of these two radically different views on meaning,
have proceeded virtually ignoring each other, focussing on totally different se-
mantic phenomena. As a matter of fact, a whole range of issues in the agenda
of formal semantics, such as semantic compositionality, quantification, infer-
ence, anaphora, modality, or tense, have remained beyond the main horizon of
distributional semantics.
Distributional vectors are very different from semantic representations adopted
in symbolic and formal models of meaning. Distributional representations are
quantitative, continuous, gradable and distributed. These properties directly stem
from the fact that distributional representations are not symbolic structures, but
real-valued vectors. Quantitative and gradable semantic representations are com-
monly adopted in cognitive science to account for key properties of concepts such
as graded category membership, typicality and vagueness (Hampton 2007).
Concepts are thus represented with vectors of features, weighted according to their
importance for a concept (Smith and Medin 1981, McRae et al. 1997). Vector dimen-
sions are typically derived from semantic feature norms (McRae et al. 2005a),
which are collected by asking native speakers to generate properties they consider
important to describe the meaning of a word. The number of subjects that listed a
certain feature for a concept is then used as feature weight.
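Such feature-weighted concept vectors can be sketched directly. The concepts, features, and weights below are invented for illustration in the spirit of McRae-style norms; they are not taken from any actual norming study.

```python
import numpy as np

# Hypothetical feature weights: the number of (imaginary) subjects who
# listed each property for each concept.
features = ["has_wings", "flies", "has_fur", "barks", "is_a_pet"]
concepts = {
    "robin":  np.array([28, 30, 0, 0, 2], dtype=float),
    "canary": np.array([25, 27, 0, 0, 18], dtype=float),
    "dog":    np.array([0, 0, 29, 30, 27], dtype=float),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(concepts["robin"], concepts["canary"]))
print(cosine(concepts["robin"], concepts["dog"]))
```

The two birds, sharing heavily weighted features, end up far more similar to each other than either is to the dog, exactly as in distributional spaces; only the source of the dimensions differs.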
The quantitative and gradable character of distributional representations
makes them very similar to the way information is represented in artificial neural
networks. Connectionist models use non-symbolic distributed representations
formed by real-valued vectors such that
“each entity is represented by a pattern
of activity distributed over many computing elements, and each computing ele-
ment is involved in representing many different entities” (Hinton et al. 1986: 77).
Distributional representations are also distributed because the semantic proper-
ties of lexical items emerge from comparisons between their n-dimensional vec-
tors, for example by measuring their similarity in distributional vector space. The
semantic content of a word therefore lies in its global distributional history
encoded in the vector, rather than in some specific set of semantic features or
relations. Neural networks are general algorithms that encode information
with vectors of neural unit activations and learn high-order representations
from co-occurrence statistics across stimulus events in the environment.
Connectionism is fully consistent with the Distributional Hypothesis, since lin-
guistic co-occurrences are just a particular type of stimuli that can be learnt by
neural networks. A natural convergence thus exists between research on neural
networks and distributional semantics. In distributional approaches to meaning,
lexical representations emerge from co-occurrences with linguistic contexts.
Moreover, distributional semantic spaces are built with computational models –
including neural networks – that use domain-independent learning algorithms
recording the distributional statistics in the linguistic input. Nowadays, neural
networks in fact represent one particular family of computational models in
distributional semantics (cf. Section 2.4.1).
The notions of distributed and distributional representations are closely
related but need to be kept clearly distinct. In fact, the former concerns the
way semantic information is represented with vectors, while the latter concerns
the source of the information used to build the vectors. The term “distributional”
specifically refers to the fact that the vectors encode the statistical distribution of
lexemes in linguistic contexts. All distributional representations are distributed, but
not all distributed representations are distributional. It is indeed possible to repre-
sent words with distributed semantic representations that are not distributional.
Vector space representations of meaning are in fact common in cognitive science
(Markman 1999). Osgood (1952) and Osgood, Suci and Tannenbaum (1957) provide
some of the first models of concepts in terms of n-dimensional semantic spaces.
However, the dimensions of Osgood’s semantic spaces are not distributional, but
are built according to the method of the “semantic differential”: subjects are asked
to locate the meaning of a word along different scales between two polar adjectives
(e.g., happy – sad, slow – fast, hard – soft, etc.), and their ratings are used to de-
termine its position in the semantic space, which mainly captures connotative as-
pects of meaning. Rogers and McClelland (2004) use a neural network to learn
distributed representations with vector dimensions encoding specific semantic
properties (e.g., has_wings, flies, is_a_plant, etc.), and computational simulations
with distributed representations derived from feature norms are proposed by Cree,
McRae and McNorgan (1999) and Vigliocco (2004). Gärdenfors (2000, 2014) repre-
sents concepts and lexical meanings with regions in “conceptual spaces”. These
are defined as vector spaces whose dimensions are “qualities” of objects, corre-
sponding to the different ways stimuli are judged to be similar or different, such
as weight, temperature, height, etc. In Gärdenfors’ classical example, colors are
characterized by a three-dimensional vector space defined by hue, brightness, and
saturation. The meaning of a color term like red is then identified with a region in
this color space, and color similarities are defined via the distance of the corre-
sponding regions in space. The geometrical representation of concepts proposed
by Gärdenfors indeed closely resembles vector-based representations adopted in
distributional semantics, but the dimensions of conceptual spaces correspond to
attributes of objects, rather than to linguistic contexts.
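Gärdenfors’ geometric picture can be sketched by reducing regions to single points in an assumed hue–saturation–brightness space. The coordinates below are invented for illustration, each dimension is scaled to [0, 1], and the circularity of the hue dimension is deliberately ignored in this simplification.

```python
import math

# Illustrative (hue, saturation, brightness) coordinates; the values are
# assumptions, not measured data, and real models use regions, not points.
colors = {
    "red":    (0.00, 0.9, 0.6),
    "orange": (0.08, 0.9, 0.7),
    "blue":   (0.60, 0.9, 0.5),
}

def distance(a, b):
    # Similarity is modeled as Euclidean proximity in the quality space.
    return math.dist(colors[a], colors[b])

print(distance("red", "orange"), distance("red", "blue"))
```

Red lies much closer to orange than to blue in this space, mirroring perceived color similarity; the formal apparatus is the same as in distributional spaces, but the dimensions are perceptual qualities rather than linguistic contexts.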
3 Empirical challenges: Two illustrations
In the introduction, we identified five questions that are crucial for all treat-
ments of meaning in language. In relation to lexical semantics, they concern
the nature of lexical meaning, what the relation between words and their mean-
ings is, how meanings are learned, stored, communicated and understood, and
how they change. Section 2 has broadly followed these as guiding questions in
reviewing and comparing the main approaches to lexical semantics. In this con-
cluding section, we will invert the perspective and consider two specific empiri-
cal domains and the challenges that they pose, namely color terms and
semantic flexibility in context. The two are viewed from different perspectives,
which foreground respectively the need for extensive and carefully constructed
data sets, and the need for a clear delineation (theoretical as well as empirical)
of what counts as
“lexical item” and how it is modeled, for any data-driven con-
clusion about the mental lexicon and generally about the role of language in
cognition.
3.1 Color terms
What all approaches have in common is the need for empirically-based obser-
vation in one form or another, be they textual or experimental. In much of to-
day
’s research on lexical meaning we often see a combination of methods
facilitated by the rapid development of technical knowledge when it comes to
theoretical computational advances as well as when it comes to technical
equipment for data storage and analysis. For all approaches, we also see the
need for proper integration with mental processes related to the cognitive sys-
tem (categorization and reasoning), to the perceptive and affective systems,
and to the role of communication, that is, how human beings make use of lan-
guage to make themselves understood and to obtain responses to what they
are saying.
The color domain has been one of the most investigated, as well as one of
the most “popular” areas within the realm of lexical semantics. The study that
changed the investigation of color terms is Berlin and Kay’s (1969) famous
study of color terms in various languages. It has become a backbone for other
types of research on color terms and has been further developed since its publi-
cation. As pointed out by Majid, Jordan and Dunn (2015), the methodology given
by Berlin and Kay was refined in the World Color Survey (Kay et al., 2009) – the
largest ever empirical study of semantics, featuring 110 languages spoken primar-
ily by small-scale, preliterate, non-industrialized communities. The World Color
Survey enabled researchers to show cross-linguistic differences in color naming
that reflect cognitive principles and to point to differences in boundaries that lan-
guages impose onto the color spectrum. As emphasized by Majid, Jordan and
Dunn (2015), Berlin and Kay’s work has been an inspiration for many types of
research, but it has been also criticized for over-sampling or under-sampling
Indo-European color terms. Research on color terms was conducted with regard
to some Indo-European sub-families, like Slavic (Comrie and Corbett 1993), but
no large-scale investigation has been undertaken. Therefore, there was room for
a more integrative study that would take into account data from a large number
of Indo-European languages. Such an endeavor was a project called Evolution of
Semantic Systems (EOSS). The project was conducted at the Max Planck Institute
for Psycholinguistics (Nijmegen) from 2011 to 2014 and included research on 50
Indo-European languages. The project was grounded on linguistic, psychological
and anthropological theoretical frameworks. One of the basic goals of the project
was to investigate the color terms speakers use in the partition of the color spectrum.
Research on color terms within the EOSS project consisted of several different tri-
als with adult participants. The empirically-based results from the project enabled
further investigation of the lexicalization patterns speakers use in color naming and
thus in conveying different meanings. First, it enabled a cross-linguistic analysis of
genetically related languages. For example, a cross-linguistic analysis of lexicali-
zation patterns used in color naming in Croatian, Polish and Czech (Raffaelli,
Kopecka, Chromý, 2019) showed a high degree of correlation between word-
formation processes and the meanings that are conveyed by particular color
terms. Thus, for example, all three languages use suffixes like -kast (Croatian),
-aw (Polish) or -av (Czech) to convey the meaning ‘-ish’, as in zelenkast ‘greenish’,
or -ast (Croatian), -ow (Polish) or -ov (Czech) with the meaning ‘N-like’, as in
narančast ‘orange-like’. However, Polish and Czech have some additional suffixes
with meanings that do not appear in Croatian, like -sk-/-ck- ‘typical of’ (Czech) or
-n- ‘made of’ (Polish). Second, results from psycholinguistic research (based on
the frequency data of the terms used in the partition of the color spectrum) en-
abled comparison to the data collected via other empirically-based methods. For
example, the EOSS data for Croatian were compared to the frequency data from
the Croatian n-gram system (based on the Web as Corpus approach) consisting of
1.72 billion tokens (Dembitz et al., 2014). The 165 different Croatian color terms
(types) from the EOSS project were checked in the Croatian n-gram system in
order to provide evidence about their attestation in a large language resource.
Moreover, the combination of two different methods shed light on the correlation
between the strategies speakers use in color-naming, and the degree of con-
ventionalization based on the corpus data. The frequency data from the Croatian
n-gram system show that basic color terms are significantly the most frequent
ones, and are thus highly conventionalized. The data also show that com-
pounding is a more pervasive process in the formation of color terms in
Croatian than derivation (which is usually more productive in the formation
of new lexemes). This means that compounding allows for a more fine-
grained naming of the color spectrum and allows for greater creativity in
color naming than derivation does. There is also a high degree of frequency
correlation between the most frequent compound terms in the two data sets.
The compound zeleno-plava
‘green-blue’ and plavo-zelena ‘blue-green’ are
the most frequent compound terms. These terms cover the central and the
largest part of the color spectrum (typical for all the Indo-European lan-
guages) and according to the corpus data refer to phenomena in nature like
sea, water, different types of plants, etc. The combination of the two methods
also showed the continuum of more and less conventionalized terms and
their cognitive entrenchment. Terms less frequently used by speakers in the
process of color naming are also the less frequent terms in the corpus. The
combination of the two empirically based methods could have an impact on fu-
ture research of the correlation between perception and cognition as univer-
sal human capacities and the constraints imposed by cultural differences and
typological differences of languages on the formation of lexical items.
Interesting evidence on the interplay between language and perception
comes from the study of congenitally blind subjects, who show a close similarity
with sighted subjects in the use and understanding of color terms. In a multidi-
mensional scaling analysis performed by Marmor (1978) with similarity judg-
ments about color terms, the similarity space of the congenitally blind subjects
closely approximates Newton’s color wheel and the judgments by sighted con-
trol participants. Therefore, she concludes that knowledge of color relations
can be acquired without first-hand sensory experience. The congenitally blind
child studied by Landau and Gleitman (1985), Kelli, was indeed able to acquire
impressive knowledge about color terms, including the constraints governing
their correct application to concrete nouns, without overextending them to ab-
stract or event nouns. The common interpretation of these data is that congeni-
tally blind people possess substantial knowledge about the visual world derived
through linguistic input. Language-derived information either comes in the
form of “supervised” verbal instructions (e.g., teaching that cherries are red) or
in the form of “unsupervised” distributional analysis of linguistic contexts.
Language, in fact, contains expressions such as yellow banana or red cherry
that can be used to learn information about color-noun associations, as well as
the general constraints concerning the applicability of color adjectives or visual
verbs only to particular noun classes.
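The “unsupervised” route can be sketched with a toy counting model that recovers color–noun associations purely from co-occurrence. The phrases below are invented stand-ins for real corpus text, and the two nouns are hard-coded for the example.

```python
from collections import Counter

# Invented phrases standing in for corpus evidence like "yellow banana".
phrases = [
    "yellow banana", "ripe yellow banana", "yellow banana bread",
    "red cherry", "sweet red cherry", "red cherry tree",
    "green banana",
]
colors = {"yellow", "red", "green"}

# Count how often each color term co-occurs with each noun.
counts = Counter()
for phrase in phrases:
    words = phrase.split()
    for c in colors & set(words):
        for noun in ("banana", "cherry"):
            if noun in words:
                counts[(c, noun)] += 1

def best_color(noun):
    # The most strongly associated color, by raw co-occurrence frequency.
    return max(colors, key=lambda c: counts[(c, noun)])

print(best_color("banana"), best_color("cherry"))
```

Even this crude model associates bananas with yellow and cherries with red from distribution alone, which is the kind of language-derived knowledge hypothesized to be available to congenitally blind speakers.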
On the other hand, the similarities between the color spaces of congenitally
blind and sighted subjects are not fully uncontroversial. For instance, Shepard
and Cooper (1992) find important differences between the color spaces of sighted
and congenitally blind subjects, contrary to Marmor (1978). Connolly et al. (2007)
also show that the lack of visual experience of colors indeed has significant ef-
fects on the conceptual organization in blind subjects. They collect implicit sim-
ilarity judgments in an odd-man-out task about two categories of concepts,
“fruits and vegetables” and “household items”. Cluster analysis of the similar-
ity judgments reveals a major overlap between the blind and sighted similarity
spaces, but significant differences for clusters of the
“fruit and vegetables” cat-
egory for which color is a
“diagnostic” property (i.e., critical to identify the
exemplars of that category, such as being yellow for a banana). Even for blind
subjects with good knowledge of the stimulus color, this is not relevant to orga-
nize the similarity space. The hypothesis by Connolly et al. is that such contrast
stems from the different origin of color knowledge in the two groups. In the
congenitally blind, color knowledge is
“merely stipulated”, because it comes
from observing the way color terms are used in everyday speech, while in the
sighted it is an immediate form of knowledge derived from direct sensory expe-
rience, and used to categorize new exemplars. Similar differences have been
found in the feature norming study by Lenci, Baroni and Cazzolli (2013): congenitally
blind subjects in fact produced significantly fewer color terms when describing
concrete objects than sighted control subjects.
contrasting pieces of evidence show that, on the one hand, distributional infor-
mation is rich enough to allow the organization of the color space to be derived
from the linguistic input, while on the other hand the lack of direct perceptual
experience may result in critical differences in the role of and use of color
information.
The role of linguistic and perceptual information as sources of semantic re-
presentation is still a puzzle with many missing pieces. New technologies that
enable new experiments, precisely calculated results and data collected via dif-
ferent methods should be considered as the methodological backbone of con-
temporary research in lexical semantics, and as the only way to fill these gaps.
Experientially-based approaches to lexical semantics can provide evidence
about how word meanings are construed, to what extent they are conventional-
ized and how much they are influenced by perception and cognition or by cul-
tural diversity and different typological properties. The examples given above
are just one illustration of an attempt to integrate traditional, theoretically well-elaborated topics with empirically based methods.
3.2 Coercion and semantic flexibility in context
It is a simple fact that words assume different meanings in different contexts. If
this plasticity had no bounds, any word could mean anything, given an appro-
priate context. Since that is not the case, a notion of lexical content distinct
from that determined by context of use is justified; but it is a content that is at
least partly shaped by its context. For this reason, investigating the boundaries
of context-determined flexibility is and has been a central task of research in
lexical semantics (see already Cruse 1986). This traditional topic, extensively
addressed in structural and cognitive approaches, acquires a particular prominence also in “formal” models with the advent of analyses that decompose lexical items into complex formal structures (syntactic or otherwise). In rough
general terms, if lexical content is modeled as a linguistically represented struc-
ture, embedded in a larger structure, the question of what constrains lexical
semantic flexibility in context is resolved into the question of how lexical mean-
ing can and cannot be structurally decomposed. Among the large number of
phenomena and competing approaches, we can concentrate here specifically
on the phenomenon of coercion, whereby a context enforces on a lexical item an interpretation that it lacks in other contexts. The typical illustrations involve entity-denoting nominals coerced into a different interpretation by predicates that take eventualities as arguments:
(7)
a. Syd grabbed the book / cigar / bottle of wine.
b. Syd enjoyed the book / cigar / bottle of wine.
Asher (2011: 16) observes that what drives this adaptation cannot be the seman-
tics of the nominal object, because the same effect is replicated when this is a
non-existent word like zibzab:
(8)
Syd enjoyed the zibzab.
Not every predicate can freely impose its requirements, however. Still following
Asher (2011: 215), we can observe that the modifier slow qualifies a processual notion licensed by the head noun in a slow person (“slow in understanding”) or a slow animal (“slow in moving”), but not in the semantically anomalous a slow tree, although world knowledge could in principle license the reading “a slow-growing tree”. Likewise, we can enjoy an apple or finish an apple, but not
really end an apple; and the previous mention of a relevant discourse entity al-
lows us to interpret start with the kitchen as
‘start painting the kitchen’ in (9b),
but not in (9c) (adapted from Asher 2011: 19
–20):
390
Paolo Acquaviva et al.
(9)
a. ? Yesterday, Sabrina started with the kitchen.
b. Yesterday, Sabrina painted her house. She started with the kitchen.
c. ? Last month, Sabrina painted her cousin’s house. Then today, she started with the kitchen.
Positing articulated structures for the content of lexical items with different properties (like end and finish), and providing explicit constraints on how these meanings combine in context, is one way to approach these phenomena. By this move, “coercion is not really a problem about meaning change in the lexicon; it’s a problem about compositionality – about how lexically given meanings combine together in the right sort of way” (Asher 2011: 18). This aspect assumes particular prominence in syntactic decomposition approaches, which analyze
lexical content in terms of the same types of formal objects (structures, primi-
tives, combinatorial principles) as those that define linguistic contexts. Crucially,
when decompositional analyses are sufficiently precise, their empirical value can
be compared across different models and frameworks. Asher (2011: 252–255) presents some empirical arguments against the generalized use of abstract verbs for “locative” or “possessive” functions (Harley 2004, Cable 2011, among others), but he also notes that structures like want a beer effectively seem to motivate one extra verbal predicate represented in the syntactic structure, not just as a lexical inference; this is what licenses rapidly in (10a) but not (10b), as a modifier of an abstract “have” predicate in a subordinate clause:
(10) a. John wanted his stitches out rapidly.
b. ? The dog enjoyed his food rapidly.
More recently, Larson (2011) provided additional independent evidence for a
hidden clausal structure as a uniform complement of want (and other in-
tensionality-inducing verbs). Importantly, the clausal analysis that Larson ar-
gues for derives from a hypothesis on the semantics of verbs like want; it
therefore predicts (successfully) the existence of similar phenomena also in
other languages, insofar as volitional predicates can be identified. It should be
noted that Larson’s syntactic analysis (like Cable’s) does not incorporate all the assumptions of Harley’s original Distributed-Morphological account.
At least for certain verbal predicates, then, a decompositional analysis is
empirically well established and, more importantly, not limited to any one technical framework. If a notion of “lexical item” proves simplistic for at least those cases on language-internal grounds, it is reasonable to hope that these results will be subjected to critical assessment on experimental grounds,
by psycho- and neurolinguistic approaches to the mental lexicon. A failure to
take them into account leads to attributing properties (content, priming potential, ease of retrieval) to assumed “lexical items” whose existence is in fact not motivated outside of the morphological or phonological level.
Besides this general point, which is enough to cast doubt on naive approaches to lexical semantics that simplistically assume “words”, interdisciplinary perspectives arise more specifically in connection with coercion. This label groups together
various phenomena of polysemy in context, which are evidently of great importance for a proper understanding of lexical knowledge as a psychological phenomenon and of its neurological grounding. If linguistic data can shed light on the way
lexical knowledge is structured and distributed over formal representations (say,
with the morphosyntactic representation want [a cigar] mapped to an abstract
clausal structure WANT [HAVE CIGAR]), psycholinguistic investigations are indis-
pensable for understanding the dynamic aspect of this phenomenon: what distinct
sense components are activated in processing, for instance, and how do they relate
to non-linguistic background knowledge (if a clear divide can be drawn)? The very
fact that, for instance, end and finish have different coercion properties shows that
contextual flexibility varies lexically and does not entirely reduce to encyclopedic
inferences; at the same time, however, we need to know how much of the informa-
tion that goes into activating different senses is a function of linguistic knowledge,
and how much of it derives from non-linguistic knowledge – if the two can be discriminated, something which grammatical theory alone cannot verify. Similarly, it
is well known that languages with a clear mass-count opposition in nominals dif-
fer in how easily they allow nouns to be coerced into a non-favored interpretation
(as in there is still a lot of car to paint), a fact which highlights the language- and
grammar-dependent nature of this type of coercion. A traditional approach would
take for granted that synonyms like car and voiture are also directly comparable in
terms of the conceptual content they express (and so, differences in flexibility
must depend on grammar). But there is no clear-cut divide between “grammar” and “lexical item” in most decompositional accounts; the asymmetry in linguistic
flexibility derives from properties of the grammatical representation which are di-
rectly reflected in the conceptual content of these nouns. It would be extremely in-
structive to complement this theoretical stance with observable evidence
suggestive of asymmetries in conceptual representation, or in the possibility of activating certain senses in a given context of use.
The flexibility of word interpretation in context has been extensively investigated in distributional semantics. Erk and Padó (2008) use a Distributional Semantic Model to address a crucial aspect of compositionality, namely the fact that when words are composed, they tend to affect each other’s meanings. This phenomenon is related to what Pustejovsky (1995) refers to as “co-compositionality”. For instance, the meaning of run in The horse runs is different
from its meaning in The water runs (Kintsch 2001). Erk and Padó (2008) claim that
words are associated with various types of expectations (e.g., typical events for
nouns, and typical arguments for verbs) that influence each other when words
compose, thereby altering their meaning (McRae et al. 2005b). They model this
context-sensitive compositionality by distinguishing the lemma vector of a word w
1
(i.e., its out-of-context representation), from its vector in the context of another
word w
2
. The vector-in-context for w
1
is obtained by combining the lemma vector
of w
1
with the lemma vectors of the expectations activated by w
2
. For instance, the
vector-in-context assigned to run in The horse runs is obtained by combining the
lemma vector of run with the lemma vectors of the most typical verbs in which
horse appears as a subject (e.g. gallop, trot, etc.). Like in Mitchell and Lapata
(2010), various functions to build vectors in contexts are tested. Erk and Padó
(2008) evaluate their model for context-sensitive vector representation to predict
verb similarity in context (for instance slump in the context of shoulder is more
similar to slouch than to decline) and to rank paraphrases.
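The vector-in-context procedure just described can be sketched with toy vectors. Everything below (the three dimensions, the numeric values, and the choice of componentwise multiplication as the composition function) is an invented illustration of the general idea, not Erk and Padó's actual data or implementation:

```python
def centroid(vectors):
    """Average a list of equal-length vectors componentwise."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def multiply(u, v):
    """Componentwise product: one possible composition function."""
    return [a * b for a, b in zip(u, v)]

# Toy lemma vectors (out-of-context representations); values are invented.
lemma = {
    "run":    [0.9, 0.4, 0.1],
    "gallop": [0.8, 0.1, 0.0],
    "trot":   [0.7, 0.2, 0.1],
}

# Expectations activated by "horse" as a subject: its typical verbs.
horse_subject_verbs = ["gallop", "trot"]

# Vector-in-context for "run" in "The horse runs": combine the lemma
# vector of "run" with the centroid of the expectation vectors.
expectation = centroid([lemma[v] for v in horse_subject_verbs])
run_in_context = multiply(lemma["run"], expectation)
```

On real corpus-derived vectors, the resulting vector-in-context would sit closer to the gallop/trot region of the space than the out-of-context vector for run does, which is the effect the model is after.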
Distributional analyses have also been proposed for cases of coercion like
(7b) and (9) (Lapata and Lascarides 2003; Zarcone et al. 2013; Chersoni et al.
2017). Such models assume that the retrieved event (like “reading” in The man began the book) is the event most compatible with corpus-derived knowledge
about typical events and their participants. This is in contrast to traditional ac-
counts of coercion (Pustejovsky 1995) which ascribe covert event retrieval to com-
plex lexical entries associating entities with events corresponding to their typical
function or creation mode (e.g., qualia roles). Distributional semantics can thus
provide a more economical and general explanation of phenomena like coercion
that challenge formal models of compositionality. Moreover, the advantage of
distributional approaches to coercion is that they can account for psycholinguistic
evidence showing the influence of context on the interpretation of coercion sen-
tences (Zarcone et al. 2014). For example, given baker and child as subjects of fin-
ish the icing, baker will cue spread as a covert event, while child will cue eat (even
though it is perfectly possible that bakers eat icing or that children spread it).
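A minimal sketch of this retrieval idea: the covert event is the candidate whose association with both the subject and the object is strongest. The association scores and the product-based compatibility measure below are invented stand-ins for corpus-derived knowledge, used only to illustrate the mechanism:

```python
# Toy association strengths between candidate events and role fillers;
# in a real model these would be estimated from corpus co-occurrences.
toy_association = {
    ("spread", "baker"): 0.8, ("spread", "icing"): 0.9,
    ("eat",    "baker"): 0.3, ("eat",    "icing"): 0.7,
    ("spread", "child"): 0.1, ("eat",    "child"): 0.9,
}

def retrieve_event(subject, obj, candidates=("spread", "eat")):
    """Pick the covert event most compatible with subject and object."""
    def score(event):
        return (toy_association.get((event, subject), 0.0)
                * toy_association.get((event, obj), 0.0))
    return max(candidates, key=score)

# "The baker finished the icing" vs. "The child finished the icing":
# the subject shifts which covert event wins.
```

The point of the sketch is that nothing event-specific needs to be stored in the lexical entry of icing: the contrast between the baker and child readings falls out of graded, experience-derived expectations.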
Generally speaking, hypotheses framed in terms of grammatical theories tend to lack independent evidence when it comes to deciding not how to model linguistic information, but whether some information is part of linguistic knowledge or not. The very notion of “sense” could be brought into sharper focus by crossing the results of formal linguistic and experimental investigations, so that what counts as a meaning “component” for grammatical analysis
is at the same time independently validated on psycholinguistic grounds, and
vice versa. In turn, an independently validated delineation of senses would
prove useful in solving the central question whether speakers represent them
as a continuum, or whether they are grouped together under a general category
corresponding to a semantic item of the mental lexicon – and, in that case, whether this is stored and accessed as a listeme, and to what extent its content coincides with what is posited on purely language-internal grounds.
These are, as is clear, just a few suggestions on how a closer synergy between
linguistic, psycholinguistic, and neurolinguistic approaches to lexical semantic co-
ercion could contribute to a better understanding of the mental lexicon.
4 Conclusion
The positions outlined in this chapter illustrate different, quite often incompati-
ble perspectives on lexical semantics. In this they reflect the considerable diver-
sity that characterizes linguistics as a whole. The chapter has reviewed the four
key approaches that have emerged in the study of lexical semantics, with the
goal of clarifying their historical background, their specific differences, the
methodological and theoretical assumptions that lie behind those differences,
and the main strengths and the main challenges of each perspective.
A certain degree of complementarity is inevitable in such a diverse theoreti-
cal landscape. It should be noted that behind each of the main perspectives lies
a vast number of studies and often quite divergent priorities. When we move
away from fundamental assumptions and programmatic goals, it becomes clear
that the various perspectives prove anything but equivalent in their ability to
successfully deal with the various aspects of lexical knowledge such as synon-
ymy and antonymy, attested ranges of lexicalization patterns, compositionality
of meaning in complex words, paradigmatic patterns across related lexical
items, family-resemblance effects, context-induced malleability, flexibility of
meaning in use and context-invariant patterns. The questions that arise in the
study of the mental lexicon and of lexical structures bring this complementarity
into sharp focus. Over and above the requirements of a linguistic theory of lexi-
cal knowledge, the various approaches must provide an analytical framework
that can be naturally compared, and preferably reconciled, with the results of
psycholinguistic and neurolinguistic investigation.
It would be wrong, however, to see linguistic theories of lexical meaning as
inevitably incomplete rival models, in need of validation from mind and brain
sciences. Psychological and neurological methods of analysis cannot lead to
useful results about the relation between cognition and language, and specifically about lexical knowledge, without assuming a model of what lexical knowledge consists of: how it is organized, what its semantic building blocks are,
what a
‘lexical item’ is precisely, what the role of context and of non-linguistic
knowledge is, and how these aspects relate to background assumptions about
linguistic meaning. The models of lexical meaning we have reviewed articulate
different answers to this type of question, and in their ongoing development
they have amassed a wealth of results and partial conclusions that deserve to
be integrated (and challenged) by any investigation of the nature of lexical
meaning.
References
Aarts, Bas. 2004. Modelling linguistic gradience. Studies in Language 28 (1). 1–49.
Acquaviva, Paolo, & Phoevos Panagiotidis 2012. Lexical decomposition meets conceptual
atomism. Lingue e Linguaggio XI (2). 165
–180.
Apresjan, Jurij D. 1966. Analyse distributionnelle des significations et champs sémantiques
structurés. Langages 1 (1). 44
–74.
Adger, David. 2015a. Mythical myths: Comments on Vyvyan Evans
’ “The Language Myth”.
Lingua 158. 76
–80.
Adger, David. 2015b. More misrepresentation: A response to Behme and Evans 2015. Lingua
162. 160
–166.
Aikhenvald, Alexandra. 2002. Classifiers. Cambridge: Cambridge University Press.
Alexiadou, Artemis & Monika Rathert (eds.). 2010. The syntax of nominalizations across
languages and frameworks. Berlin & New York: De Gruyter.
Arad, Maya. 2003. Locality constraints on the interpretation of roots: the case of Hebrew
denominal verbs. Natural Language and Linguistic Theory 21. 737
–778.
Asher, Nicholas. 2011. Lexical meaning in context. Cambridge: Cambridge University Press.
Baldinger, Kurt. 1984. Vers une sémantique moderne. Paris: Klincksieck.
Baroni, Marco & Alessandro Lenci. 2010. Distributional Memory: A General Framework for
Corpus-Based Semantics. Computational Linguistics 36 (4). 673
–721.
Behme, Christina & Vyvyan Evans. 2015. Leaving the myth behind: A reply to Adger (2015).
Lingua 162. 149
–159.
Benczes, Reka, Antonio Barcelona & Francisco Ruiz de Mendoza Ibáñez (eds.). 2011. Defining
metonymy in Cognitive linguistics: Towards a consensus view. Amsterdam: John Benjamins.
Bierwisch, Manfred & Robert Schreuder. 1992. From concepts to lexical items. Cognition 42.
23
–60.
Booij, Geert. 2010. Construction Morphology. Oxford: Oxford University Press.
Borer, Hagit. 2005a. In Name Only. Oxford: Oxford University Press.
Borer, Hagit. 2005b. The normal course of events. Oxford: Oxford University Press.
Borer, Hagit. 2013. Taking form. Oxford: Oxford University Press.
Berlin, Brent & Paul Kay. 1969. Basic Color Terms: Their Universality and Evolution. Berkeley,
CA: University of California Press.
Brown, Susan Windisch. 2008. Polysemy and the mental lexicon. Colorado Research in
Linguistics 21. 1
–12.
Budanitsky, Alexander & Graeme Hirst. 2006. Evaluating WordNet-based measures of lexical
semantic relatedness. Computational Linguistics 32. 13
–47.
Cable, Seth. 2011. A New Argument for Lexical Decomposition: Transparent Readings of Verbs.
Linguistic Inquiry 42. 131
–138.
Chersoni, Emmanuele, Alessandro Lenci & Philippe Blache. 2017. Logical Metonymy in a
Distributional Model of Sentence Comprehension. In Proceedings of the 6th Joint
Conference on Lexical and Computational Semantics (*SEM 2017). 168
–177.
Chierchia, Gennaro. 1998. Reference to Kinds across Languages. Natural Language Semantics
6. 339
–405.
Chierchia, Gennaro. 2010. Mass nouns, vagueness and semantic variation. Synthese 174. 99
–149.
Chung, Sandra. 2000. On reference to kinds in Indonesian. Natural Language Semantics 8 (2).
157
–171.
Conklin, Harold C. 1973. Color categorization. American Anthropologist 75. 931
–942.
Comrie, Bernard & Greville G. Corbett (eds.). 1993. The Slavonic Languages. London:
Routledge.
Connolly, Andrew C., Lila R Gleitman & Sharon L. Thompson-Schill. 2007. Effect of congenital
blindness on the semantic representation of some everyday concepts. Proceedings of the
National Academy of Sciences of the United States of America 104 (20). 8241
–8246.
Coseriu, Eugenio. 1973. Sincronía, Diacronía e Historia
– El problema del cambio lingüístico.
Madrid: Editorial Gredos, S.A.
Coseriu, Eugenio. 2000. Structural semantics and “cognitive” semantics. Logos and Language 1 (1). 19–42.
Cree, George S., Ken McRae & Chris McNorgan. 1999. An attractor model of lexical conceptual
processing: simulating semantic priming. Cognitive Science 23 (3). 371
–414.
Croft, William & D. Alan Cruse. 2004. Cognitive Linguistics. Cambridge: Cambridge University
Press.
Croft, William. 2012. Verbs: aspect and causal structure. Oxford: Oxford University Press.
Cruse, D. Alan. 1986. Lexical semantics. Cambridge: Cambridge University Press.
Cruse, Alan. D. 2011. Meaning in Language. Oxford: Oxford University Press.
Curran, James R. 2003. From Distributional to Semantic Similarity. PhD thesis, University of
Edinburgh.
Dembitz,
Šandor, Gordan Gledec & Mladen Sokele. 2014. An economic approach to big data in
a minority language. Procedia Computer Science 35. 427
–436.
Divjak, Dagmar. 2010. Structuring the Lexicon: a Clustered Model for Near-Synonymy. Berlin:
De Gruyter
Dixon, R. M. W. & Alexandra Y. Aikhenvald. 2009. Adjective classes: A cross-linguistic
typology. Oxford: Oxford University Press.
Dölling, Johannes & Tatjana Heyde-Zybatow. 2007. Verb Meaning: How much Semantics is in
the Lexicon? In Andreas Späth (ed.), Interface and interface Conditions, 33
–76. Berlin: de
Gruyter.
Dowty, David. 1979. Word meaning and Montague grammar. Dordrecht: Kluwer.
Eckardt, Regine. 2006. Meaning change in grammaticalization: An inquiry into semantic
reanalysis. Oxford: Oxford University Press.
Engelberg, Stefan. 2011. Frameworks of lexical decomposition of verbs. In Claudia Maienborn,
Klaus von Heusinger & Paul Portner (eds.), Semantics: An international handbook of
natural language meaning, Vol. 1, 358
–399. Berlin: Mouton de Gruyter.
Erk, Katrin & Sebastian Padó 2008. A structured vector space model for word meaning in
context. In Proceedings of EMNLP 08. 897
–906.
Evans, Vyvyan. 2014. The language myth: Why language is not an instinct. Cambridge:
Cambridge University Press.
Fábregas, Antonio & Sergio Scalise. 2012. Morphology. From data to theories. Edinburgh:
Edinburgh University Press.
Fellbaum, Christiane (ed). 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT
Press.
Fillmore, Charles J. & Paul Kay. 1995. Construction grammar. Berkeley: Ms., University of
California, Berkeley.
Fillmore, Charles. 2006. Frame semantics. In Dirk Geeraerts (ed.), Cognitive Linguistics: basic
readings, 373
–400. Berlin: Mouton de Gruyter.
Firth, John R. 1951. Modes of meaning. In John R. Firth (ed.), Essays and Studies [Reprinted in
Papers in Linguistics 1934
–1951], 190–215. London: Oxford University Press.
Firth, John R. 1957. A synopsis of linguistic theory, 1930
–1955. Studies in linguistic analysis,
1
–32. Oxford: Philological Society.
Fodor, Jerry & Ernie Lepore. 1999. Impossible words? Linguistic Inquiry 30. 445
–453.
Gärdenfors, Peter. 2000. Conceptual Spaces: On the Geometry of Thought. Cambridge, MA:
MIT Press.
Gärdenfors, Peter. 2014. The geometry of meaning: semantics based on conceptual spaces.
Cambridge, MA: MIT Press.
Geeraerts, Dirk. 1997. Diachronic Prototype Semantics
– A Contribution to Historical
Lexicology. Oxford: Clarendon Press.
Geeraerts, Dirk. 2010. Theories of lexical semantics. Oxford: Oxford University Press.
Gerner, Matthias. 2014. Noncompositional scopal morphology in Yi. Morphology 24. 1
–24.
Gibbs, Raymond. 1994. The poetics of mind. Figurative thought, language, and understanding.
New York: Cambridge University Press.
Giora, Rachel. 2003. On our mind: salience, context and figurative language. New York: Oxford
University Press.
Landau, Barbara & Lila R. Gleitman. 1985. Language and experience. Evidence from the Blind
Child. Cambridge, MA: Harvard University Press.
Goldberg, Adele. 1995. Constructions: A construction grammar approach to argument
structure. Chicago: University of Chicago Press.
Goldberg, Adele. 2006. Constructions at work: The nature of generalization in language.
Oxford: Oxford University Press.
Gonzalez-Marquez, Monica, Irene Mittelberg, Seana Coulson & Michael, J. Spivey. 2007.
Methods in Cognitive Linguistics. Amsterdam: John Benjamins.
Guiraud, Pierre. 1967. Structures étymologiques du lexique français. Paris: Larousse
Hale, Kenneth & Samuel Jay Keyser. 1999. A response to Fodor and Lepore,
“Impossible
words?
”. Linguistic Inquiry 30. 453–466.
Hale, Kenneth & Samuel Jay Keyser. 2002. Prolegomenon to a theory of argument structure.
Cambridge, MA: MIT Press.
Hale, Kenneth & Samuel Jay Keyser. 2005. Aspect and the syntax of argument structure. In
Nomi Erteschik-Shir and Tova Rapoport (eds.), The Syntax of Aspect, 11
–41. Oxford:
Oxford University Press.
Hampton, James A. 2007. Typicality, graded membership, and vagueness. Cognitive Science,
31, 355
–383.
Hanks, Patrick & Rachel Giora. 2011. Metaphor and figurative language. London: Routledge
Harley, Heidi. 2004. Wanting, Having, and Getting: A Note on Fodor and Lepore 1998.
Linguistic Inquiry 35. 255
–267.
Harley, Heidi. 2012. Semantics in Distributed Morphology. In Claudia Maienborn, Klaus von
Heusinger & Paul Portner (eds.), Semantics: An international handbook of natural
language meaning, volume 3 (HSK 33.3), 2151
–2172. Berlin: Mouton de Gruyter.
Harley, Heidi. 2014. On the identity of roots. Theoretical Linguistics 40 (3/4). 225
–276.
Harris, Zellig S. 1954. Distributional structure. Word 10 (2
–3). 146–162.
Harris, Zellig S. 1991. A Theory of Language and Information: A Mathematical Approach.
Oxford: Clarendon Press.
Hinton, Geoffrey E., James L. McClelland & David E. Rumelhart. 1986. Distributed
representations. In David E. Rumelhart & James L. McClelland (eds), Parallel Distributed
Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations,
77
–109. Cambridge, MA: MIT Press.
Hinzen, Wolfram. 2007. An essay on names and truth. Oxford: Oxford University Press.
Jackendoff, Ray. 2010. Meaning and the lexicon: The parallel architecture 1975
–2010. Oxford:
Oxford University Press.
Jackendoff, Ray. 1990. Semantic structures. Cambridge, MA: MIT Press.
Jackendoff, Ray. 2002. Foundations of language. Oxford: Oxford University Press.
Jackendoff, Ray. 2011. Conceptual semantics. In Klaus von Heusinger, Claudia Maienborn, &
Paul Portner (eds.), Semantics: An international handbook of natural language meaning,
volume 1, 688
–709. Berlin: Mouton de Gruyter.
Jakobson, Roman. 1959. On linguistic aspects of translation. In Reuben A. Brower (ed.), On
Translation, 232
–239. Cambridge, MA: Harvard University Press.
Jones, Michael N., Jon A. Willits & Simon Dennis. 2015. Models of Semantic Memory. In Jerome
R. Busemeyer, Zeng Whang, James T. Townsend & Ami Eidels (eds.), Oxford Handbook of
Mathematical and Computational Psychology, 232
–254. Oxford: Oxford University Press.
Jones, Steven, M. Lynne Murphy, Carita Paradis & Caroline Willners. 2012. Antonyms in English:
construals, constructions and canonicity. Cambridge: Cambridge University Press.
Kaufmann, Ingrid. 1995. What is an (im)possible verb? Restrictions on semantic form and their
consequences for argument structure. Folia Linguistica 29. 67
–103.
Kay, Paul, Brent Berlin, Luisa Maffi, William R. Merrifield & Richard Cook. 2009. The World
Colour Survey. Stanford: CSLI Publications
Kintsch, Walter. 2001. Predication. Cognitive Science 25 (2). 173
–202.
Kleiber, Georges. 1978. Le mot
“ire” en ancien français (XIe-XIIie siècles) – Essai d’analyse
sémantique. Paris: Klincksieck.
Klein, Devorah & Gregory Murphy. 2001. The Representation of Polysemous Words. Journal of
Memory and Language 45. 259
–282.
Koontz-Garboden, Andrew. 2005. On the typology of state/change of state alternations. Yearbook of
Morphology 2005, 83
–117. Dordrecht: Springer.
Lakoff, George & Mark Johnson. 1980. Metaphors we live by. Chicago: Chicago University Press.
Lakoff, George. 1987. Women, fire and dangerous things. Chicago: Chicago University Press.
Lakoff, George. 1990. The invariance hypothesis. Cognitive Linguistics, 1(1). 39
–74.
Landauer, Thomas K. & Susan Dumais. 1997. A solution to Plato
’s problem: The latent
semantic analysis theory of acquisition, induction, and representation of knowledge.
Psychological Review 104 (2). 211
–240.
Langacker, Ronald. 1987. Foundations of cognitive grammar. Stanford: Stanford University Press.
Langacker, Ronald. 1998. Conceptualization, Symbolization, and Grammar. In Michael
Tomasello (ed.), The New Psychology of Language: Cognitive and Functional Approaches
to Language Structure, 1
–39. Mahwah, NJ and London : Erlbaum.
Lapata, Mirella & Alex Lascarides. 2003. A probabilistic account of logical metonymy.
Computational Linguistics 29 (2). 263
–317.
Larson, Richard. 2011. Clauses, propositions and phases. In Anna-Maria DiSciullo & Cedric
Boeckx (eds.), The biolinguistic enterprise: New perspectives on the evolution and nature
of the human language faculty, 366
–391. Oxford: Oxford University Press.
Larson, Richard, & Gabriel Segal. 1994. Knowledge of meaning. Cambridge, MA: MIT Press.
Lehrer, Adrienne. 1974. Semantic fields and lexical structure. Amsterdam: North Holland.
Lenci, Alessandro. 2008. Distributional approaches in linguistic and cognitive research.
Italian Journal of Linguistics 20 (1). 1
–31
Lenci, Alessandro. 2018. Distributional models of word meaning. Annual Review of Linguistics
4. 151
–171.
Lenci, Alessandro, Marco Baroni & Giulia Cazzolli. 2013. Una prima analisi delle norme
semantiche BLIND. In Giovanna Marotta, Linda Meini & Margherita Donati (eds.), Parlare
senza vedere: Rappresentazioni semantiche nei non vedenti, 83
–93. Pisa, ETS.
Lenci, Alessandro, Marco Baroni, Giulia Cazzolli & Giovanna Marotta. 2013. BLIND: a set of
semantic feature norms from the congenitally blind. Behavior Research Methods 45(4).
1218
–1233.
Levin, Beth. 2011. Conceptual categories and linguistic categories I: Introduction. http://web.
stanford.edu/~bclevin/lsa11intro.pdf (Accessed on 29/4/2019).
Levin, Beth & Malka Rappaport Hovav. 2011. Lexical conceptual structure. In Klaus von
Heusinger, Claudia Maienborn, & Paul Portner (eds.), Semantics: An international
handbook of natural language meaning, volume 1, 418
–438. Berlin: Mouton de Gruyter.
Levy, Omer & Yoav Goldberg. 2014. Linguistic regularities in sparse and explicit word
representations. In Proceedings of the Eighteenth Conference on Computational
Language Learning. 171
–180.
Libben, Gary & Silke Weber. 2014. Semantic transparency, compounding, and the nature of
independent variables. In Franz Rainer, Francesco Gardani, Hans Christian Luschütsky &
Wolfgang U. Dressler (eds.), Morphology and meaning, 205
–222. Amsterdam &
Philadelphia: John Benjamins.
Lieber, Rochelle. 2004. Morphology and Lexical Semantics. Cambridge: Cambridge University
Press.
Longobardi, Giuseppe. 2001. How comparative is semantics? A unified parametric theory of
bare nouns and proper names. Natural Language Semantics 9/4. 335
–369.
Lucy, John A. 1997. The linguistics of “color.” In Clyde Laurence Hardin & Luisa Maffi (eds.),
Color categories in thought and language, 320
–346. Cambridge: Cambridge University
Press.
Lund, Kevin & Curt Burgess. 1996. Producing high-dimensional semantic spaces from lexical
co-occurrence. Behavior Research Methods, Instruments, & Computers 28. 203
–208.
Lyons, John. 1977/1993. Semantics. Cambridge: Cambridge University Press.
Majid, Asifa, Fiona Jordan & Michael Dunn. 2015. Semantic systems in closely related
languages. Language Sciences 49 (1). 1
–18.
Majid, Asifa & Stephen C. Levinson. 2007. The language of vision I: Colour. In Asifa Majid
(ed.), Field manual, vol. 10, 22
–25. Nijmegen: Max Planck Institute for Psycholinguistics.
Malt, Barbara C. & Asifa Majid. 2013. How thought is mapped into words. WIREs Cognitive
Science 4 (6). 583
–597.
Marantz, Alec. 1997. No Escape from syntax: Don
’t try morphological analysis in the privacy of
your own lexicon. In Alexis Dimitriadis, Laura Siegel, Clarissa Surek-Clark & Alexander
Williams (eds.), Proceedings of the 21st Annual Penn Linguistics Colloquium: Penn
Working Papers in Linguistics 4.2, 201
–225.
Markman, Arthur B. 1999. Knowledge Representation, Mahwah, NJ: Lawrence Erlbaum
Associates.
Marmor, Gloria S. 1978. Age at onset of blindness and the development of the semantics of
color names. Journal of Experimental Child Psychology 25 (2). 267
–278.
Martinet, André. 1989. Reflexions sur la signification. La linguistique
– Sens et signification
25. 43
–51
Massam, Diane (ed.). 2012. Count and mass across languages. Oxford: Oxford University Press.
McRae, Ken, Virginia R. de Sa & Mark S. Seidenberg. 1997. On the nature and scope of featural representations of word meaning. Journal of Experimental Psychology: General 126 (2). 99–130.
McRae, Ken, George S. Cree, Mark S. Seidenberg & Chris McNorgan. 2005a. Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods 37 (4). 547–559.
McRae, Ken, Mary Hare, Jeffrey L. Elman & Todd Ferretti. 2005b. A basis for generating expectancies for verbs from nouns. Memory & Cognition 33 (7). 1174–1184.
Meillet, Antoine. 1958. Comment les mots changent de sens. Linguistique historique et linguistique générale 1. 230–271.
Mikolov, Tomas, Kai Chen, Greg Corrado & Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In Proceedings of the International Conference on Learning Representations, 1–12.
Miller, George & Walter Charles. 1991. Contextual correlates of semantic similarity. Language and Cognitive Processes 6 (1). 1–28.
Mitchell, Jeff & Mirella Lapata. 2010. Composition in distributional models of semantics. Cognitive Science 34 (8). 1388–1429.
Murphy, Gregory & Jane Andrew. 1993. The conceptual basis of antonymy and synonymy in adjectives. Journal of Memory and Language 32. 301–319.
Murphy, Gregory. 2002. The big book of concepts. Cambridge, MA: MIT Press.
Murphy, Gregory. 2007. Parsimony and the psychological representation of polysemous words. In Marina Rakova, Gergely Pethő & Csilla Rakosi (eds.), The cognitive bases of polysemy: New sources of evidence for theories of word meaning, 47–70. Frankfurt: Peter Lang.
Murphy, M. Lynne. 2003. Semantic relations and the lexicon. Cambridge: Cambridge University Press.
Murphy, M. Lynne, Carita Paradis & Caroline Willners. 2009. Discourse functions of antonymy: A cross-linguistic investigation of Swedish and English. Journal of Pragmatics 41 (11). 2159–2184.
Osgood, Charles E. 1952. The nature and measurement of meaning. Psychological Bulletin 49. 197–237.
Osgood, Charles E., George J. Suci & Percy H. Tannenbaum. 1957. The measurement of meaning. Urbana, IL: University of Illinois Press.
400
Paolo Acquaviva et al.
Ouhalla, Jamal. 2012. Lexical change and the architecture of the lexicon. In Esther Torrego (ed.), Of grammar, words, and verses: In honor of Carlos Piera, 41–66. Amsterdam: John Benjamins.
Padó, Sebastian & Mirella Lapata. 2007. Dependency-based construction of semantic space models. Computational Linguistics 33 (2). 161–199.
Panther, Klaus-Uwe & Linda Thornburg. 2003. Metonymy and pragmatic inferencing.
Amsterdam: John Benjamins.
Paradis, Carita. 2000. Reinforcing adjectives: A cognitive semantic perspective on grammaticalization. In Ricardo Bermúdez-Otero, David Denison, Richard M. Hogg & Christopher B. McCully (eds.), Generative theory and corpus studies, 233–258. Berlin/New York: Mouton de Gruyter.
Paradis, Carita. 2001. Adjectives and boundedness. Cognitive Linguistics 12 (1). 47–65.
Paradis, Carita. 2003. Is the notion of linguistic competence relevant in Cognitive Linguistics? Annual Review of Cognitive Linguistics 1. 207–231.
Paradis, Carita. 2004. Where does metonymy stop? Senses, facets and active zones. Metaphor and Symbol 19 (4). 245–264.
Paradis, Carita. 2005. Ontologies and construals in lexical semantics. Axiomathes 15. 541–573.
Paradis, Carita. 2008. Configurations, construals and change: Expressions of degree. English Language and Linguistics 12 (2). 317–343.
Paradis, Carita. 2011. Metonymization: Key mechanism in language change. In Reka Benczes, Antonio Barcelona & Francisco Ruiz de Mendoza Ibáñez (eds.), Defining metonymy in Cognitive Linguistics: Towards a consensus view, 61–88. Amsterdam: John Benjamins.
Paradis, Carita. 2012. Lexical semantics. In Carol A. Chapelle (ed.), The encyclopedia of applied linguistics, 690–697. Oxford: Wiley-Blackwell.
Paradis, Carita. 2015. Meanings of words: Theory and application. In Ulrike Hass & Petra Storjohann (eds.), Handbuch Wort und Wortschatz (Handbücher Sprachwissen-HSW Band 3), 274–294. Berlin: Mouton de Gruyter.
Paradis, Carita, Caroline Willners & Steven Jones. 2009. Good and bad opposites: Using textual and psycholinguistic techniques to measure antonym canonicity. The Mental Lexicon 4 (3). 380–429.
Peirsman, Yves, Kris Heylen & Dirk Speelman. 2007. Finding semantically related words in Dutch: Co-occurrences versus syntactic contexts. In Marco Baroni, Alessandro Lenci & Magnus Sahlgren (eds.), Proceedings of the 2007 Workshop on Contextual Information in Semantic Space Models: Beyond Words and Documents, 9–16.
Peirsman, Yves & Dirk Speelman. 2009. Word space models of lexical variation. In Roberto Basili & Marco Pennacchiotti (eds.), Proceedings of the EACL GEMS Workshop, 9–16.
Perek, Florent. 2016. Using distributional semantics to study syntactic productivity in diachrony: A case study. Linguistics 54 (1). 149–188.
Pustejovsky, James. 1995. The generative lexicon. Cambridge, MA: MIT Press.
Pylkkänen, Liina, Rodolfo Llinás & Gregory Murphy. 2006. The representation of polysemy: MEG evidence. Journal of Cognitive Neuroscience 18. 97–109.
Raffaelli, Ida. 2009. Značenje kroz vrijeme: poglavlja iz dijakronijske semantike [Meaning through time: Chapters in diachronic semantics]. Zagreb: Disput.
Raffaelli, Ida, Jan Chromý & Anetta Kopecka. 2019. Lexicalization patterns in color naming in Croatian, Czech and Polish. In Ida Raffaelli, Daniela Katunar & Barbara Kerovec (eds.), Lexicalization patterns in color naming: A cross-linguistic perspective. Amsterdam: John Benjamins.
Rainer, Franz. 2014. Polysemy in derivation. In Rochelle Lieber & Pavol Štekauer (eds.), The Oxford handbook of derivational morphology, 338–353. Oxford: Oxford University Press.
Ramchand, Gillian. 2008. Verb meaning and the lexicon: A first-phase syntax. Cambridge:
Cambridge University Press.
Rappaport Hovav, Malka & Beth Levin. 1998. Building verb meanings. In Miriam Butt & Willi Geuder (eds.), The projection of arguments: Lexical and compositional factors, 97–134. Stanford, CA: CSLI Publications.
Rappaport Hovav, Malka & Beth Levin. 2010. Reflections on manner/result complementarity. In Edit Doron, Malka Rappaport Hovav & Ivy Sichel (eds.), Syntax, lexical semantics, and event structure, 21–38. Oxford: Oxford University Press.
Regier, Terry, Paul Kay & Naveen Khetarpal. 2007. Color naming reflects optimal partitions of color space. Proceedings of the National Academy of Sciences of the United States of America 104 (4). 1436–1441.
Riemer, Nick. 2013. Conceptualist semantics: Explanatory power, scope and uniqueness. Language Sciences 35. 1–19.
Riemer, Nick. 2016. Internalist semantics: Meaning, conceptualization and expression. In Nick Riemer (ed.), The Routledge handbook of semantics, 30–47. London: Routledge.
Rogers, Timothy T. & James L. McClelland. 2004. Semantic cognition: A parallel distributed processing approach. Cambridge, MA: MIT Press.
Rosch, Eleanor. 1973. Natural categories. Cognitive Psychology 4. 328–350.
Rosch, Eleanor. 1975. Cognitive representations of semantic categories. Journal of Experimental Psychology: General 104. 192–233.
Rothstein, Susan. 2004. Structuring events: A study in the semantics of lexical aspect. Oxford:
Blackwell.
Rothstein, Susan. 2010. Counting and the mass-count distinction. Journal of Semantics 27. 343–397.
Sagi, Eyal, Stefan Kaufmann & Brady Clark. 2009. Semantic density analysis: Comparing word meaning across time and phonetic space. In Proceedings of the EACL GEMS Workshop, 104–111.
Sahlgren, Magnus. 2006. The word-space model: Using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. Stockholm: Stockholm University PhD thesis.
Saussure, Ferdinand de. 1959/1986. Course in general linguistics. London: Peter Owen.
Schwarze, Christoph & Marie-Therese Schepping. 1995. Polysemy in a two-level semantics. In Urs Egli, Peter E. Pause, Christoph Schwarze, Arnim von Stechow & Götz Wienold (eds.), Lexical knowledge and the organization of the language, 275–300. Amsterdam: John Benjamins.
Senft, Gunter (ed.). 2000. Systems of Nominal Classification. Cambridge: Cambridge
University Press.
Shepard, Roger N. & Lynn A. Cooper. 1992. Representation of colors in the blind, color-blind, and normally sighted. Psychological Science 3 (2). 97–104.
Sinclair, John. 1987. Looking up: an account of the COBUILD project in lexical computing and
the development of the Collins COBUILD English Language Dictionary. London: Harper
Collins.
Smith, Edward E. & Douglas L. Medin. 1981. Categories and Concepts. Cambridge, MA:
Harvard University Press.
Storjohann, Petra. 2010. Synonyms in corpus texts: Conceptualisation and construction. In Petra Storjohann (ed.), Lexical-semantic relations: Theoretical and practical perspectives, 69–94. Amsterdam: John Benjamins.
Svenonius, Peter. 2008. The position of adjectives and other phrasal modifiers in the decomposition of DP. In Louise McNally & Chris Kennedy (eds.), Adjectives and adverbs: Syntax, semantics, and discourse, 16–42. Oxford: Oxford University Press.
Talmy, Leonard. 1985. Lexicalization patterns. In Timothy Shopen (ed.), Language typology and syntactic description, volume 3, 57–149. Cambridge: Cambridge University Press.
Talmy, Leonard. 2000. Toward a cognitive semantics. Cambridge, MA: MIT Press.
Taylor, John. 2003. Linguistic Categorization. Oxford: Oxford University Press.
Tomasello, Michael. 2003. Constructing a language: a usage-based theory of language
acquisition. Cambridge, MA: Harvard University Press.
Tomasello, Michael. 2008. Origins of human communication. Cambridge, MA: MIT Press.
Traugott, Elizabeth & Richard B. Dasher. 2001. Regularity in semantic change. Cambridge: Cambridge University Press.
Traugott, Elizabeth & Graeme Trousdale. 2013. Constructionalization and Constructional
Changes. Oxford: Oxford University Press.
Trier, Jost. 1931. Der deutsche Wortschatz im Sinnbezirk des Verstandes: Die Geschichte eines sprachlichen Feldes, I: Von den Anfängen bis zum Beginn des 13. Jahrhunderts. Heidelberg: Winter.
Tsujimura, Natsuko. 2014. Mimetic verbs and meaning. In Franz Rainer, Francesco Gardani, Hans Christian Luschützky & Wolfgang U. Dressler (eds.), Morphology and meaning, 303–314. Amsterdam & Philadelphia: John Benjamins.
Turney, Peter D. & Patrick Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research 37. 141–188.
Tversky, Amos. 1977. Features of similarity. Psychological Review 84 (4). 327–352.
Ullmann, Stephen. 1969. Précis de sémantique française. Bern: A. Francke.
Ullmann, Stephen. 1973. Meaning and Style. Oxford: Basil Blackwell.
Ullmann, Stephen. 1983. Semantics: An introduction to the science of meaning. Oxford: Basil Blackwell.
van de Cruys, Tim. 2008. A comparison of bag of words and syntax-based approaches for word categorization. In Proceedings of the ESSLLI Workshop on Distributional Lexical Semantics, 47–54.
van de Weijer, Joost, Carita Paradis, Caroline Willners & Magnus Lindgren. 2012. As lexical as it gets: The role of co-occurrence of antonyms in a visual lexical decision experiment. In Dagmar Divjak & Stefan Th. Gries (eds.), Frequency effects in language: Linguistic representations, 255–279. Berlin: Mouton de Gruyter.
van de Weijer, Joost, Carita Paradis, Caroline Willners & Magnus Lindgren. 2014. Antonym canonicity: Temporal and contextual manipulations. Brain & Language 128 (1). 1–8.
Verhagen, Arie. 2005. Constructions of intersubjectivity: Discourse, syntax and cognition. Oxford: Oxford University Press.
von Stechow, Arnim. 1995. Lexical decomposition in syntax. In Urs Egli, Peter E. Pause, Christoph Schwarze, Arnim von Stechow & Götz Wienold (eds.), Lexical knowledge and the organization of the language, 81–118. Amsterdam & Philadelphia: John Benjamins.
Vigliocco, Gabriella & David P. Vinson. 2007. Semantic representation. In Gareth Gaskell (ed.), The Oxford handbook of psycholinguistics, 195–215. Oxford: Oxford University Press.
Vigliocco, Gabriella, David P. Vinson, William Lewis & Merrill F. Garrett. 2004. Representing the meanings of object and action words: The featural and unitary semantic space hypothesis. Cognitive Psychology 48. 422–488.
Wilhelm, Andrea. 2008. Bare nouns and number in Dëne Sųliné. Natural Language Semantics 16 (1). 39–68.
Wittgenstein, Ludwig. 1922. Tractatus logico-philosophicus. Translated by C. K. Ogden. London: Routledge & Kegan Paul.
Wittgenstein, Ludwig. 1968. Philosophical investigations (translated by G. E. M. Anscombe). Oxford: Blackwell.
Wunderlich, Dieter. 1997. Cause and the structure of verbs. Linguistic Inquiry 28. 27–68.
Zarcone, Alessandra, Alessandro Lenci, Sebastian Padó & Jason Utt. 2013. Fitting, not clashing! A distributional semantic model of logical metonymy. In Proceedings of IWCS 2013, 404–410.
Zarcone, Alessandra, Sebastian Padó & Alessandro Lenci. 2014. Logical metonymy resolution in a words-as-cues framework: Evidence from self-paced reading and probe recognition. Cognitive Science 38. 973–996.