1. Introduction


Phonotactics concerns the organization of the phonemes in words and syllables. The phonotactic rules constrain how phonemes combine in order to form larger linguistic units (syllables and words) in that language (Laver, 1994). For example, Cohen, Ebeling & van Holk (1972) describe the phoneme combinations possible in Dutch, which will be the language in focus in this study.

Phonotactic rules are implicit in natural languages so that humans require no explicit instruction about which combinations are allowed and which are not. An explicit phonotactic grammar can of course be abstracted from the words in a language, but this is an activity linguists engage in, not language learners in general. Children normally learn a language's phonotactics in their early language development and probably update it only slightly once they have mastered the language.

Most work on language acquisition has arisen in linguistics and psychology, and that work employs mechanisms that have been developed for language, typically discrete symbol-manipulation systems. Phonotactics in particular has been modeled with n-gram models, Finite State Machines, Inductive Logic Programming, etc. (Tjong Kim Sang, 1998; Konstantopoulos, 2003). Such approaches are effective, but a cognitive scientist may ask whether the same success could be achieved using less custom-made tools. The brain, viewed as a computational machine, exploits other principles, which have been modeled in the approach known as Parallel Distributed Processing (PDP), thoroughly described in the seminal work of Rumelhart & McClelland (1986). Computational models inspired by brain structure and neural processing principles are known as Neural Networks (NNs), or connectionist models.

Learning phonotactic grammars is not an easy problem, especially when one restricts one's attention to cognitively plausible models. Since languages are experienced and produced dynamically, we need to focus on the processing of sequences, which complicates the learning task. The history of research in connectionist language learning shows both successes and failures even when one concentrates on simpler structures, such as phonotactics (Stoianov, Nerbonne & Bouma, 1998; Stoianov & Nerbonne, 2000; Tjong Kim Sang, 1995; Tjong Kim Sang & Nerbonne, 1999; Pacton, Perruchet, Fayol & Cleeremans, 2001).

This paper will attack phonotactics learning with models that have no specifically linguistic knowledge encoded a priori. The models naturally do have a bias, viz., toward extracting local conditioning factors for phonotactics, but we maintain that this is a natural bias for many sorts of sequential behavior, not only linguistic processing. A first-order Discrete Time Recurrent Neural Network (DTRNN) (Carrasco, Forcada & Neco, 1999; Tsoi & Back, 1997) will be used: the Simple Recurrent Network (SRN) (Elman, 1988). SRNs have been applied to different language problems (Elman, 1991; Christiansen & Chater, 1999; Lawrence, Giles & Fong, 1995), including learning phonotactics (Shillcock, Levy, Lindsey, Cairns & Chater, 1993; Shillcock, Cairns, Chater & Levy, 1997). With respect to phonotactics, we have also contributed reports (Stoianov et al., 1998; Stoianov & Nerbonne, 2000; Stoianov, 1998).
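The defining feature of the SRN is a context layer that copies back the previous hidden state, so each prediction is conditioned on the string seen so far. The following is a minimal sketch of a single forward step, not the authors' implementation; the dimensions, weights, and phoneme indices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper): one-hot phoneme input,
# a small hidden layer, and an output distribution over the inventory.
n_phonemes, n_hidden = 30, 20

W_in  = rng.normal(0, 0.1, (n_hidden, n_phonemes))
W_ctx = rng.normal(0, 0.1, (n_hidden, n_hidden))
W_out = rng.normal(0, 0.1, (n_phonemes, n_hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def srn_step(x, context):
    """One SRN step: the hidden state depends on the current input
    and the previous hidden state (fed back via the context layer)."""
    hidden = sigmoid(W_in @ x + W_ctx @ context)
    logits = W_out @ hidden
    probs = np.exp(logits) / np.exp(logits).sum()  # successor likelihoods
    return probs, hidden                           # hidden becomes next context

# Feed a short phoneme string (as one-hot vectors) through the network.
context = np.zeros(n_hidden)
for idx in [3, 7, 1]:  # hypothetical phoneme indices
    x = np.eye(n_phonemes)[idx]
    probs, context = srn_step(x, context)
```

After training (e.g. with BPTT), `probs` would approximate the distribution of legal successors given the context; here the untrained weights only illustrate the recurrence.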

SRNs have been shown capable of representing regular languages (Omlin & Giles, 1996; Carrasco et al., 1999). Kaplan & Kay (1994) demonstrated that the apparently context-sensitive rules that are standardly found in phonological descriptions can in fact be expressed within the more restrictive formalism of regular relations. We begin thus with a device which is in principle capable of representing the needed patterns.

We then simulate the language learning task by training networks to produce context-dependent predictions. We also show how the continuous predictions of trained SRNs - likelihoods that a particular token can follow the current context - can be transformed into more useful discrete predictions, or, alternatively, string recognitions.

In spite of the above claims about representability, the Back-Propagation (BP) and Back-Propagation Through Time (BPTT) learning algorithms used to train SRNs do not always find optimal solutions - SRNs that produce only correct context-dependent successors or recognize only strings from the training language. Hence, section 3 focuses on the practical demonstration that a realistic language learning task may be simulated by an SRN. We evaluate the network learning from different perspectives - grammar learning, phonotactics learning, and language recognition. The last two methods need one language-specific parameter - a threshold - that distinguishes successors/words allowed in the training language from those that are not. This threshold is found with a post-training procedure, but it could also be sought interactively during training.
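The post-training threshold search can be pictured as a simple one-dimensional optimization: given the network's likelihood scores for attested and unattested successors, choose the cut-off that best separates the two. This is a sketch of the idea only; the scores, labels, and selection criterion (accuracy) are assumptions, not the paper's exact procedure.

```python
import numpy as np

# Hypothetical post-training data: the SRN's likelihood for each
# (context, successor) pair, labelled 1 if that successor is attested
# in the training language and 0 otherwise.
scores = np.array([0.9, 0.8, 0.6, 0.4, 0.3, 0.1])
labels = np.array([1,   1,   1,   0,   1,   0])

def best_threshold(scores, labels):
    """Pick the cut-off that best separates allowed from disallowed
    successors: predict 'allowed' iff score >= threshold."""
    best_t, best_acc = 0.0, 0.0
    for t in np.unique(scores):          # candidate thresholds, ascending
        acc = np.mean((scores >= t) == labels.astype(bool))
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

t, acc = best_threshold(scores, labels)  # here: t = 0.3, acc = 5/6
```

A single scalar suffices because the decision is binary; the same threshold, applied at every string position, also turns the network into a whole-string recognizer.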

Finally, section 4 assesses the networks from linguistic and psycholinguistic perspectives: a static analysis extracts acquired linguistic knowledge from network weights, and the network performance is compared to humans' in a lexical decision task. The network performance, in particular the distribution of errors as a function of string position, will be compared to alternative construals of Dutch syllabic structure - following a suggestion from discussions of psycholinguistic experiments about English syllables (Kessler & Treiman, 1997).


Yüklə 3,17 Mb.

Dostları ilə paylaş:
1   ...   33   34   35   36   37   38   39   40   ...   92




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©azkurs.org 2024
rəhbərliyinə müraciət

gir | qeydiyyatdan keç
    Ana səhifə


yükləyin