The performance of a neural predictor trained on phonotactics may be evaluated with different methods, depending on the particular task to which the network is applied. In this section, we evaluate the networks that performed best in the pilot studies.
3.6.1. Likelihoods
The direct outcome of training on the sequential prediction task is that the network learns the distribution of successors. This is therefore used as a basic evaluation method: the empirical context-dependent successor distribution PsL(C) is matched against the network's context-dependent predictions NPsL(C). For this purpose, the output of the network is normalized and compared with the distribution in the language data. This procedure resulted in a mean L2 (semi-Euclidean) distance of 0.063 to 0.064, where the optimal value would be zero (see Table 1, right 3 columns).19 These values are close to the optimum, but baseline models (completely random networks) also yield an L2 distance of approximately 0.085, so the absolute distance should be read against that baseline.
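The comparison described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the text: the distance is computed as a plain Euclidean distance per context and then averaged over contexts (the exact definition of the "semi-Euclidean" measure is not reproduced here), the network output is normalized by dividing by its sum, and the function names, contexts, and toy values are all hypothetical.

```python
import math


def normalize(activations):
    """Scale non-negative output activations so that they sum to 1."""
    total = sum(activations)
    if total == 0:
        # Assumption: fall back to a uniform distribution if all outputs are zero.
        return [1.0 / len(activations)] * len(activations)
    return [a / total for a in activations]


def l2_distance(p, q):
    """Plain Euclidean distance between two equal-length distributions."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def mean_l2(empirical, network_outputs):
    """Average the per-context L2 distance over all contexts.

    empirical:       dict mapping a context to its successor distribution
    network_outputs: dict mapping the same contexts to raw output activations
    """
    distances = [
        l2_distance(empirical[context], normalize(network_outputs[context]))
        for context in empirical
    ]
    return sum(distances) / len(distances)


# Hypothetical toy data: two contexts, three possible successors each.
empirical = {"#k": [0.5, 0.5, 0.0], "#ka": [0.0, 0.0, 1.0]}
outputs = {"#k": [0.8, 0.8, 0.0], "#ka": [0.0, 0.0, 0.9]}

score = mean_l2(empirical, outputs)
```

In this toy case the normalized outputs coincide with the empirical distributions, so the score is at the optimum of zero. Running the same function on the outputs of an untrained network would give the kind of baseline against which the trained network's distance can be compared.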