November 17
Why do human languages have homophones?
Sean Trott
Department of Cognitive Science, UC San Diego
Human languages are replete with ambiguity. This is most evident in homophony, where two or more words sound the same but carry distinct meanings. For example, the wordform “bark” can denote either the sound produced by a dog or the protective outer sheath of a tree trunk. Why would a system evolved for efficient, effective communication display rampant ambiguity? Some accounts argue that ambiguity is actually a design feature of human communication systems, allowing languages to recycle their best wordforms (those which are short, frequent, and phonotactically well-formed) for multiple meanings. We test this claim by constructing five series of artificial lexica matched for the phonotactics and distribution of word lengths found in five real languages (English, German, Dutch, French, and Japanese), and comparing both the quantity and concentration of homophony across the real and artificial lexica.
Surprisingly, we find that the artificial lexica exhibit higher upper bounds on homophony than their real counterparts, and that homophony is even more likely to be found among short, phonotactically plausible wordforms in the artificial than in the real lexica. These results suggest that homophony in real languages is not directly selected for, but rather emerges as a natural consequence of other features of a language. In fact, homophony may even be selected against in real languages, producing lexica that better conform to other requirements of the humans who need to use them.
We then ask whether the same is true of polysemy (in English), a form of lexical ambiguity in which the same wordform has two or more related meanings. In contrast to the homophony results, we find that, at least in English, wordforms are more polysemous than one would expect from their phonotactics and length alone. Combined, our findings suggest that these two forms of ambiguity, homophony and polysemy, may face distinct selection pressures.
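
To make the real-versus-artificial comparison in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the procedure used in the work itself: it assumes a simple bigram phonotactic model, a toy lexicon in which each letter stands in for a phoneme, and a basic homophony measure (lexical entries minus distinct wordforms). All function names and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict

START, END = "<", ">"  # word-boundary symbols (assumed not to occur as phonemes)

def train_bigram(lexicon):
    """Count phoneme-to-phoneme transitions, including word boundaries."""
    counts = defaultdict(Counter)
    for word in lexicon:
        symbols = [START, *word, END]
        for prev, nxt in zip(symbols, symbols[1:]):
            counts[prev][nxt] += 1
    return counts

def sample_word(counts, rng):
    """Generate one wordform by walking the bigram chain until END is drawn."""
    word, prev = [], START
    while True:
        options = counts[prev]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == END:
            return "".join(word)
        word.append(nxt)
        prev = nxt

def sample_lexicon(real_lexicon, rng):
    """Sample an artificial lexicon with the same word-length counts as the real one."""
    counts = train_bigram(real_lexicon)
    quota = Counter(len(w) for w in real_lexicon)
    artificial = []
    while len(artificial) < len(real_lexicon):
        word = sample_word(counts, rng)
        # Accept a word only if its length still has unfilled slots.
        # Repeated wordforms are kept on purpose: they are the homophones.
        if quota[len(word)] > 0:
            quota[len(word)] -= 1
            artificial.append(word)
    return artificial

def homophone_count(lexicon):
    """Number of entries that share a wordform with an earlier entry."""
    return len(lexicon) - len(set(lexicon))

# Toy "real" lexicon: one entry per meaning, so "bark" and "bat" appear twice.
real = ["bark", "bark", "bat", "bat", "tab", "cat",
        "rat", "art", "tar", "at", "bar", "car"]

rng = random.Random(0)
simulated = [homophone_count(sample_lexicon(real, rng)) for _ in range(200)]
print("homophones in toy real lexicon:", homophone_count(real))
print("mean homophones in artificial lexica:", sum(simulated) / len(simulated))
```

On a toy lexicon this small the numbers carry no linguistic weight; the sketch only shows the shape of the comparison, in which the real lexicon's homophony is set against a baseline distribution generated from its own phonotactics and word lengths.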