The following should pique the curiosity of a few linguistically inclined people as well as that, I presume, of various palaeo-creationists:
Language evolved in a leap
Conflicting needs may have driven rapid development of communication.
22 January 2003
PHILIP BALL
BIG SNIP
It has been known since the 1940s that human languages do indeed show just this kind of statistical distribution of word usage - the social scientist George Kingsley Zipf spotted the power-law behaviour. But it has never been satisfactorily explained before, although Zipf himself speculated that it might represent some kind of "principle of least effort".
It would be interesting if someone, with quick and easy access to PNAS, were to tell us a bit more about this "communication jump" and what it may imply or how it can be interpreted from an evolutionary/palaeoanthropological point-of-view.
Jacques Cinq-Mars
I'm afraid this paper is more likely to be of interest to a fringe group of mathematicians than to linguists, let alone paleoanthropologists. Most mathematicians accept that Zipf's law is merely an artifact of the stochastic nature of information. But I'll recap as briefly as possible:
The law boils down to:
rf = C
where
r is the rank of a word;
f is the frequency of occurrence of the word;
C is a constant that depends on the text being analyzed.
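The law thus says that a word's frequency of use is inversely proportional to its rank: the second most common word in a text occurs roughly half as often as the most common one, the third roughly a third as often, and so on down the list of different words.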
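For anyone who wants to try the relation on a text of their own, here is a minimal sketch in Python. The function name rank_frequency and the toy sample sentence are my own illustrations, not anything from Zipf or from the paper, and a sentence this short is far too small to show the law; it only shows how r, f and the product rf are tabulated.

```python
import re
from collections import Counter

def rank_frequency(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    rows = []
    for rank, (word, freq) in enumerate(counts.most_common(), start=1):
        rows.append((rank, word, freq, rank * freq))
    return rows

# Toy input, only to show the shape of the table.
sample = "the cat and the dog and the bird saw the cat"
for row in rank_frequency(sample):
    print(row)
```

On a long real text, the last column (rf) should stay within the same rough order of magnitude over the middle ranks if the law holds; the very top and very bottom ranks usually deviate.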
There are two alternatives. Either Zipf's law reflects a universal property of the human mind or else it merely represents some necessary consequence of the laws of probability (George Miller 1965). Zipf chose the former (synthetic) hypothesis and derived a Principle of Least Effort. Almost all mathematicians chose the latter (analytic) hypothesis and searched for a probabilistic explanation; according to this, Zipf's curves are merely "one way to express a necessary consequence of regarding a message source as a stochastic process" (George Miller 1965).
The principle behind Zipf's law is disappointing: it can be derived as a consequence of very simple chaos. Thus texts consisting of randomly generated letters and spaces also obey the law. Indeed it has been shown that monkeys typing random letters on a keyboard produce "texts" whose "word" frequencies obey Zipf's law.
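The typing-monkey case is easy to simulate. The sketch below is my own illustration, not the procedure used in any of the papers mentioned here: the four-letter alphabet, the fixed seed, the text length and the helper names monkey_text and loglog_slope are all assumptions. It types random letters and spaces, splits the result into "words", ranks them by frequency and fits a straight line to log(frequency) against log(rank). A slope in the neighbourhood of -1 is what rf = C would predict.

```python
import math
import random
from collections import Counter

def monkey_text(n_chars, alphabet="abcd", seed=0):
    """Random 'typing': each keystroke is a letter or a space, all equally likely."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    symbols = alphabet + " "
    return "".join(rng.choice(symbols) for _ in range(n_chars))

def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

text = monkey_text(50_000)
# split() treats any run of spaces as one separator, so no empty "words" appear.
freqs = [f for _, f in Counter(text.split()).most_common()]
slope = loglog_slope(freqs)
print("distinct 'words':", len(freqs))
print("log-log slope:", round(slope, 2))
```

With equally likely keys, all "words" of the same length are equally probable, so the rank-frequency curve comes out as a staircase rather than a smooth line; the fitted slope is only a rough summary of it.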
The real problem became that "mathematicians believe in Zipf's law because they think that linguists have established it to be a linguistic law, and linguists believe in it because they on their part think that mathematicians have established it to be a mathematical law" (Herdan, 1966). In truth it is neither.
Most mathematicians thereafter lost interest in any real-world properties of the "law". However, the new interest in natural language processing and computerization sparked a revival of interest in the issue, as evidenced by the paper "Zipf's law and the structure and evolution of languages" (Tsonis, Schultz and Tsonis 1997), which attempted to distinguish the case of natural language from the case of the typing monkeys. That paper was rebutted (IMHO) by Li in a letter to the editor published in the journal Complexity in 1998. (I can go into this in more detail if anyone wants, but it's highly technical.)
Further attempts have been made to rehabilitate the matter. For example, the authors of the current paper published "Zipf's Law and Random Texts" in the journal Advances in Complex Systems (Cancho and Sole, 2002). They argue that when random texts and real texts are compared through (a) the lexical spectrum and (b) the distribution of words of similar length, real texts fill the lexical spectrum more efficiently, regardless of word length. In their view this suggests that Zipf's law is highly meaningful for real language.
I suspect the current paper is a follow-on to that earlier one. The authors' viewpoint remains a minority one. Although I have not seen the current paper, I remain very skeptical as to what, if anything, the authors may have succeeded in proving. From my perspective Zipf's "law" remains a peculiar property of chaotic systems, nothing more.
Harry (Skeptical1)