Zipf's law
==Statistical explanations==
Although Zipf's law holds for most natural languages, and even for certain [[Constructed language|artificial ones]] such as [[Esperanto]]<ref name=mana2006/> and [[Toki Pona]],<ref name=skot2020/> the reason is still not well understood.<ref name=bril1959/> Recent reviews of generative processes for Zipf's law include [[Michael Mitzenmacher|Mitzenmacher]], "A Brief History of Generative Models for Power Law and Lognormal Distributions",<ref>{{cite journal |last=Mitzenmacher |first=Michael |author-link=Michael Mitzenmacher |date=January 2004 |title=A brief history of generative models for power law and lognormal distributions |journal=Internet Mathematics |volume=1 |issue=2 |pages=226–251 |doi=10.1080/15427951.2004.10129088 |doi-access=free }}</ref> and Simkin and Roychowdhury, "Re-inventing Willis".<ref>{{cite journal |last1=Simkin |first1=M.V. |last2=Roychowdhury |first2=V.P. |title=Re-inventing Willis |journal=Physics Reports |date=December 2010 |doi=10.1016/j.physrep.2010.12.004 |arxiv=physics/0601192 }}</ref>

Zipf's law may be partly explained by statistical analysis of randomly generated texts. Wentian Li has shown that in a document in which each character is chosen randomly from a uniform distribution over all letters (plus a space character), the "words" of different lengths follow the macro-trend of Zipf's law: shorter words are more probable, and all words of the same length are equally probable.<ref name=liwe1992/>

In 1959, [[Vitold Belevitch]] observed that if any of a large class of well-behaved [[statistical distribution]]s (not only the [[normal distribution]]) is expressed in terms of rank and expanded into a [[Taylor series]], the first-order truncation of the series results in Zipf's law.
Further, a second-order truncation of the Taylor series results in [[Zipf–Mandelbrot law|Mandelbrot's law]].<ref name=bele1959/><ref name=neum2011/>

The [[principle of least effort]] is another possible explanation: Zipf himself proposed that neither speakers nor hearers using a given language wants<!--"wants" is grammatically correct here; do not change to "want"!--> to work any harder than necessary to reach understanding, and the process that results in an approximately equal distribution of effort leads to the observed Zipf distribution.<ref name=zipf1949/><ref name=ferr2003/>

A minimal explanation assumes that words are generated by [[Infinite monkey theorem|monkeys typing randomly]]. If language is generated by a single monkey typing randomly, with a fixed and nonzero probability of hitting each letter key or white space, then the words (letter strings separated by white spaces) produced by the monkey follow Zipf's law.<ref>{{cite journal |last1=Conrad |first1=B. |last2=Mitzenmacher |first2=M. |title=Power Laws for Monkeys Typing Randomly: The Case of Unequal Probabilities |journal=IEEE Transactions on Information Theory |date=July 2004 |volume=50 |issue=7 |pages=1403–1414 |doi=10.1109/TIT.2004.830752 }}</ref>

Another possible cause for the Zipf distribution is a [[preferential attachment]] process, in which the value {{mvar|x}} of an item tends to grow at a rate proportional to {{mvar|x}} (intuitively, "[[Matthew effect|the rich get richer]]" or "success breeds success"). Such a growth process results in the [[Yule–Simon distribution]], which has been shown to fit word frequency versus rank in language<ref name=linr2014/> and population versus city rank<ref name=vita2015/> better than Zipf's law. The distribution was originally derived by Yule to explain population versus rank in species, and was later applied to cities by Simon.
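
The preferential attachment mechanism can be illustrated with a small simulation. The following is a minimal sketch in the spirit of Simon's text-generation process, not a reproduction of any cited study: the function names, the parameter <code>alpha</code> (assumed probability of introducing a new word) and the default values are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def simon_process(n_tokens, alpha=0.1, seed=0):
    """Generate a list of word ids by a 'rich get richer' process.

    Assumption for this sketch: with probability alpha a brand-new word
    is introduced; otherwise one token is copied uniformly at random
    from the text written so far, so an existing word is chosen with
    probability proportional to its current count.
    """
    rng = random.Random(seed)
    tokens = [0]      # the text starts with one occurrence of word 0
    next_id = 1       # id to assign to the next new word
    while len(tokens) < n_tokens:
        if rng.random() < alpha:
            tokens.append(next_id)
            next_id += 1
        else:
            tokens.append(rng.choice(tokens))
    return tokens


def rank_frequency(tokens):
    """Return word counts sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


if __name__ == "__main__":
    freqs = rank_frequency(simon_process(50_000, alpha=0.1, seed=0))
    # Print the count observed at a few ranks; on a log-log plot these
    # points would be compared against a straight line.
    for rank in (1, 10, 100, 1000):
        if rank <= len(freqs):
            print(rank, freqs[rank - 1])
```

Copying a uniformly chosen earlier token is a simple way to sample a word in proportion to its count without maintaining explicit weights. The resulting rank-frequency list is heavy-tailed; how closely it matches a Yule–Simon or Zipf fit depends on <code>alpha</code> and on the length of the run.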
A similar explanation is based on Atlas models: systems of exchangeable positive-valued [[diffusion process]]es whose drift and variance parameters depend only on the rank of the process. It has been shown mathematically that Zipf's law holds for Atlas models that satisfy certain natural regularity conditions.<ref name=fern2020/><ref name=taot2012/>
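
Returning to the random-typing explanations discussed earlier in this section, the uniform case can be checked numerically. The sketch below is an illustration under stated assumptions rather than a reproduction of the cited analyses: the four-letter alphabet, the function names and the text length are hypothetical choices. It types characters uniformly at random from the letters plus a space, splits the text into "words", and reports the mean count per distinct word for each word length.

```python
import random
from collections import Counter


def random_typing(n_chars, letters="abcd", seed=0):
    """Type n_chars characters uniformly from letters plus a space.

    Returns the list of non-empty 'words' (maximal runs of letters).
    """
    rng = random.Random(seed)
    alphabet = letters + " "
    text = "".join(rng.choice(alphabet) for _ in range(n_chars))
    return text.split()


def mean_count_by_length(words):
    """Map each word length to the mean count of distinct words of that length."""
    counts = Counter(words)
    total = Counter()      # total occurrences per word length
    distinct = Counter()   # number of distinct words per word length
    for word, count in counts.items():
        total[len(word)] += count
        distinct[len(word)] += 1
    return {length: total[length] / distinct[length] for length in sorted(total)}


if __name__ == "__main__":
    means = mean_count_by_length(random_typing(200_000))
    for length, mean in means.items():
        print(length, round(mean, 2))
```

With <math>k</math> equiprobable letters and one space key, every word of a given length is equally likely, and each additional letter divides a word's probability by <math>k + 1</math>. The rank-frequency curve is therefore a staircase; since there are about <math>k^L</math> words of length up to <math>L</math>, its envelope decays roughly as a power of rank with exponent <math>\log(k+1)/\log k</math>, which is close to 1 for large alphabets.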