==Explanations==

Benford's law tends to apply most accurately to data that span several [[orders of magnitude]]. As a rule of thumb, the more orders of magnitude the data evenly cover, the more accurately Benford's law applies. For instance, one can expect that Benford's law would apply to a list of numbers representing the populations of United Kingdom settlements. But if a "settlement" is defined as a village with population between 300 and 999, then Benford's law will not apply.<ref name="dspguide">{{cite book |chapter-url=http://www.dspguide.com/ch34.htm |title=The Scientist and Engineer's Guide to Digital Signal Processing |section=Chapter 34: Explaining Benford's Law. The Power of Signal Processing |author=Steven W. Smith |access-date=15 December 2012}}</ref><ref name="fewster">{{Cite journal |first=R. M. |last=Fewster |author-link=Rachel Fewster |s2cid=39595550 |title=A simple explanation of Benford's Law |journal=The American Statistician |year=2009 |volume=63 |issue=1 |pages=26–32 |doi=10.1198/tast.2009.0005 |url=https://www.stat.auckland.ac.nz/~fewster/RFewster_Benford.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://www.stat.auckland.ac.nz/~fewster/RFewster_Benford.pdf |archive-date=2022-10-09 |url-status=live |citeseerx=10.1.1.572.6719}}</ref>

Consider the probability distributions shown below, referenced to a [[log scale]]. In each case, the total area in red is the relative probability that the first digit is 1, and the total area in blue is the relative probability that the first digit is 8. For the first distribution, the sizes of the red and blue areas are approximately proportional to the widths of the red and blue bars, so numbers drawn from this distribution approximately follow Benford's law. For the second distribution, by contrast, the ratio of the red and blue areas differs greatly from the ratio of the widths of the red and blue bars; the relative areas are determined more by the heights of the bars than by their widths. Accordingly, the first digits in this distribution do not satisfy Benford's law at all.<ref name=fewster />

{|
|[[File:BenfordBroad.png|thumb|left|300px|A broad probability distribution of the log of a variable, shown on a log scale. Benford's law can be seen in the larger area covered by red (first digit 1) compared to blue (first digit 8) shading.]]
|[[File:BenfordNarrow.gif|thumb|left|300px|A narrow probability distribution of the log of a variable, shown on a log scale. Benford's law is not followed, because the distribution is too narrow.]]
|}

Thus, real-world distributions that span several [[orders of magnitude]] rather uniformly (e.g., stock-market prices and populations of villages, towns, and cities) are likely to satisfy Benford's law very accurately. On the other hand, a distribution mostly or entirely within one order of magnitude (e.g., [[IQ score]]s or heights of human adults) is unlikely to satisfy Benford's law very accurately, if at all.<ref name=dspguide /><ref name=fewster /> However, the difference between applicable and inapplicable regimes is not a sharp cut-off: as the distribution gets narrower, the deviations from Benford's law increase gradually.
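This rule of thumb can be checked numerically. The sketch below is a minimal, hypothetical illustration (it assumes [[NumPy]] is available, and the two log-normal distributions and their parameters are arbitrary choices, not data from the sources cited): it compares the leading-digit frequencies of a sample whose logarithm is spread over several orders of magnitude with those of a sample confined to a small fraction of one order of magnitude.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def digit_freqs(x):
    """Observed frequencies of the leading significant digits 1..9."""
    d = (10 ** (np.log10(x) % 1)).astype(int)   # leading digit via fractional part of log10
    return np.bincount(d, minlength=10)[1:10] / len(x)

# Benford's predicted first-digit probabilities, P(d) = log10(1 + 1/d).
benford = np.log10(1 + 1 / np.arange(1, 10))

# Broad case: log10 of the variable is spread over several orders of magnitude.
broad = 10 ** rng.normal(3, 2, 100_000)
# Narrow case: log10 of the variable spans only a small fraction of one order of magnitude.
narrow = 10 ** rng.normal(3, 0.1, 100_000)

print("Benford:", np.round(benford, 3))
print("Broad:  ", np.round(digit_freqs(broad), 3))    # close to Benford's law
print("Narrow: ", np.round(digit_freqs(narrow), 3))   # concentrated on a few digits, far from Benford
</syntaxhighlight>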
(This discussion is not a full explanation of Benford's law, because it does not explain why data sets whose logarithms are relatively uniformly spread over several orders of magnitude are encountered so often.<ref name=BergerHillExplain>Arno Berger and Theodore P. Hill, [http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1074&context=rgp_rsr Benford's Law Strikes Back: No Simple Explanation in Sight for Mathematical Gem], 2011. The authors describe this argument but say it "still leaves open the question of why it is reasonable to assume that the logarithm of the spread, as opposed to the spread itself—or, say, the log log spread—should be large" and that "assuming large spread on a logarithmic scale is ''equivalent'' to assuming an approximate conformance with [Benford's law]" (italics added), something which they say lacks a "simple explanation".</ref>)

===Krieger–Kafri entropy explanation===

In 1970 [[Wolfgang Krieger]] proved what is now called the Krieger generator theorem.<ref name="Krieger1970">{{cite journal |last1=Krieger |first1=Wolfgang |title=On entropy and generators of measure-preserving transformations |journal=Transactions of the American Mathematical Society |volume=149 |issue=2 |year=1970 |pages=453 |issn=0002-9947 |doi=10.1090/S0002-9947-1970-0259068-3 |doi-access=free}}</ref><ref name="Downarowicz2011">{{cite book |author=Downarowicz, Tomasz |title=Entropy in Dynamical Systems |url=https://books.google.com/books?id=avUGMc787v8C&pg=PA106 |date=12 May 2011 |publisher=Cambridge University Press |isbn=978-1-139-50087-6 |page=106}}</ref> The Krieger generator theorem might be viewed as a justification for the assumption in the Kafri ball-and-box model that, in a given base <math>B</math> with a fixed number of digits 0, 1, ..., ''n'', ..., <math>B - 1</math>, digit ''n'' is equivalent to a Kafri box containing ''n'' non-interacting balls. Other scientists and statisticians have suggested entropy-related explanations{{which|date=October 2023}} for Benford's law.<ref>{{cite book |author=Smorodinsky, Meir |year=1971 |chapter=Chapter IX. Entropy and generators. Krieger's theorem |title=Ergodic Theory, Entropy |series=Lecture Notes in Mathematics |volume=214 |pages=54–57 |publisher=Springer |location=Berlin, Heidelberg |doi=10.1007/BFb0066096 |isbn=978-3-540-05556-3}}</ref><ref name="Jolion2001">{{cite journal |last1=Jolion |first1=Jean-Michel |title=Images and Benford's Law |journal=Journal of Mathematical Imaging and Vision |volume=14 |issue=1 |year=2001 |pages=73–81 |issn=0924-9907 |doi=10.1023/A:1008363415314 |bibcode=2001JMIV...14...73J |s2cid=34151059}}</ref><ref name="Miller2015">{{cite book |editor=Miller, Steven J. |editor-link=Steven J. Miller |title=Benford's Law: Theory and Applications |url=https://books.google.com/books?id=J_NnBgAAQBAJ&pg=309 |page=309 |date=9 June 2015 |publisher=Princeton University Press |isbn=978-1-4008-6659-5}}</ref><ref name="Lemons2019">{{cite journal |last1=Lemons |first1=Don S. |title=Thermodynamics of Benford's first digit law |journal=American Journal of Physics |volume=87 |issue=10 |year=2019 |pages=787–790 |issn=0002-9505 |doi=10.1119/1.5116005 |arxiv=1604.05715 |bibcode=2019AmJPh..87..787L |s2cid=119207367}}</ref>

===Multiplicative fluctuations===

Many real-world examples of Benford's law arise from multiplicative fluctuations.<ref name=Pietronero>{{cite journal |title=Explaining the uneven distribution of numbers in nature: the laws of Benford and Zipf |author=L. Pietronero |author2=E. Tosatti |author3=V. Tosatti |author4=A. Vespignani |journal=Physica A |year=2001 |volume=293 |issue=1–2 |pages=297–304 |doi=10.1016/S0378-4371(00)00633-6 |bibcode=2001PhyA..293..297P |arxiv=cond-mat/9808305}}</ref> For example, if a stock price starts at $100 and is then multiplied each day by a randomly chosen factor between 0.99 and 1.01, then over an extended period the probability distribution of its price satisfies Benford's law with higher and higher accuracy.

The reason is that the ''logarithm'' of the stock price is undergoing a [[random walk]], so over time its probability distribution will become increasingly broad and smooth (see [[#Overview|above]]).<ref name=Pietronero/> (More technically, the [[central limit theorem]] says that multiplying more and more random variables will create a [[log-normal distribution]] with larger and larger variance, so eventually it covers many orders of magnitude almost uniformly.) To be sure of approximate agreement with Benford's law, the distribution has to be approximately invariant when scaled up by any factor up to 10; a [[log-normal]]ly distributed data set with wide dispersion would have this approximate property.

Unlike multiplicative fluctuations, ''additive'' fluctuations do not lead to Benford's law: they lead instead to [[normal probability distribution]]s (again by the [[central limit theorem]]), which do not satisfy Benford's law. By contrast, the hypothetical stock price described above can be written as the ''product'' of many random variables (i.e. the price change factor for each day), so it is ''likely'' to follow Benford's law quite well.

===Multiple probability distributions===

[[Anton Formann]] provided an alternative explanation by directing attention to the interrelation between the [[Probability distribution|distribution]] of the significant digits and the distribution of the [[dependent variable|observed variable]]. He showed in a simulation study that long right-tailed distributions of a [[random variable]] are compatible with the Newcomb–Benford law, and that for distributions of the ratio of two random variables the fit generally improves.<ref name=Formann2010>{{Cite journal |last1=Formann |first1=A. K. |year=2010 |title=The Newcomb–Benford Law in Its Relation to Some Common Distributions |journal=PLOS ONE |volume=5 |issue=5 |pages=e10541 |doi=10.1371/journal.pone.0010541 |pmid=20479878 |pmc=2866333 |editor1-last=Morris |editor1-first=Richard James |bibcode=2010PLoSO...510541F |doi-access=free}}</ref> For numbers drawn from certain distributions ([[IQ score]]s, human heights), Benford's law fails to hold because these variates obey a normal distribution, which is known not to satisfy Benford's law:<ref name=Formann2010 /> normal distributions cannot span several orders of magnitude, and the fractional parts of the logarithms of their values will not be (even approximately) uniformly distributed. However, if one "mixes" numbers from those distributions, for example, by taking numbers from newspaper articles, Benford's law reappears.
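The effect of such mixing can be illustrated with a small simulation. The sketch below is a minimal, hypothetical example (assuming [[NumPy]]; it is not the construction used in the cited studies): each number is drawn from a different randomly chosen uniform distribution whose scale is itself spread over several orders of magnitude. No single uniform distribution satisfies Benford's law, yet the pooled leading digits come out close to Benford's prediction.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

def leading_digit(x):
    """Leading significant digit of positive values, via the fractional part of log10."""
    return (10 ** (np.log10(x) % 1)).astype(int)

n = 200_000
# For each number, first "randomly choose a distribution": here a uniform distribution
# U(0, b) whose upper bound b is itself spread over several orders of magnitude.
b = 10 ** rng.uniform(0, 6, n)          # random scales from 1 to 1,000,000
samples = b * (1.0 - rng.random(n))     # one draw in (0, b] from each chosen distribution

observed = np.bincount(leading_digit(samples), minlength=10)[1:10] / n
benford = np.log10(1 + 1 / np.arange(1, 10))
print("observed:", np.round(observed, 3))
print("Benford: ", np.round(benford, 3))
</syntaxhighlight>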
This can also be proven mathematically: if one repeatedly "randomly" chooses a [[probability distribution]] (from an uncorrelated set) and then randomly chooses a number according to that distribution, the resulting list of numbers will obey Benford's law.<ref name=Hill1995>{{Cite journal |author=Theodore P. Hill |author-link=Theodore P. Hill |title=A Statistical Derivation of the Significant-Digit Law |journal=Statistical Science |volume=10 |issue=4 |pages=354–363 |year=1995 |url=http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1042&context=rgp_rsr |doi=10.1214/ss/1177009869 |mr=1421567 |doi-access=free}}</ref><ref name=Hill1998>{{Cite journal |author=Theodore P. Hill |author-link=Theodore P. Hill |title=The first digit phenomenon |journal=[[American Scientist]] |volume=86 |issue=4 |date=July–August 1998 |page=358 |url=http://people.math.gatech.edu/~hill/publications/PAPER%20PDFS/TheFirstDigitPhenomenonAmericanScientist1996.pdf |bibcode=1998AmSci..86..358H |doi=10.1511/1998.4.358 |s2cid=13553246}}</ref> A similar probabilistic explanation for the appearance of Benford's law in everyday-life numbers has been advanced by showing that it arises naturally when one considers mixtures of uniform distributions.<ref>{{cite journal |last1=Janvresse |first1=Élise |last2=de la Rue |first2=Thierry |year=2004 |title=From Uniform Distributions to Benford's Law |url=http://lmrs.univ-rouen.fr/Persopage/Delarue/Publis/PDF/uniform_distribution_to_Benford_law.pdf |journal=Journal of Applied Probability |volume=41 |issue=4 |pages=1203–1210 |doi=10.1239/jap/1101840566 |mr=2122815 |access-date=13 August 2015 |archive-url=https://web.archive.org/web/20160304125725/http://lmrs.univ-rouen.fr/Persopage/Delarue/Publis/PDF/uniform_distribution_to_Benford_law.pdf |archive-date=4 March 2016 |url-status=dead}}</ref>

===Invariance===

In a list of lengths, the distribution of first digits may be broadly similar regardless of whether all the lengths are expressed in metres, yards, feet, inches, etc. The same applies to monetary units. This is not always the case: for example, the height of adult humans almost always starts with a 1 or 2 when measured in metres, and almost always starts with 4, 5, 6, or 7 when measured in feet. But in a list of lengths spread evenly over many orders of magnitude (for example, a list of 1000 lengths mentioned in scientific papers, including measurements of molecules, bacteria, plants, and galaxies), it is reasonable to expect the distribution of first digits to be the same no matter whether the lengths are written in metres or in feet. When the distribution of the first digits of a data set is [[scale-invariant]] (independent of the units in which the data are expressed), it is always given by Benford's law.<ref name=Pinkham>{{cite journal |last1=Pinkham |first1=Roger S. |year=1961 |title=On the Distribution of First Significant Digits |url=http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aoms/1177704862 |journal=Ann. Math. Statist. |volume=32 |issue=4 |pages=1223–1230 |doi=10.1214/aoms/1177704862 |doi-access=free}}</ref><ref name="wolfram">{{Cite web |url=https://mathworld.wolfram.com/BenfordsLaw.html |title=Benford's Law |first=Eric W. |last=Weisstein |website=mathworld.wolfram.com}}</ref>

For example, the first (non-zero) digit in the aforementioned list of lengths should have the same distribution whether the unit of measurement is feet or yards. But there are three feet in a yard, so the probability that the first digit of a length in yards is 1 must be the same as the probability that the first digit of a length in feet is 3, 4, or 5; similarly, the probability that the first digit of a length in yards is 2 must be the same as the probability that the first digit of a length in feet is 6, 7, or 8. Applying this to all possible measurement scales gives the logarithmic distribution of Benford's law.

Benford's law for first digits is also invariant under a change of [[radix|number base]]. In addition, there are conditions and proofs of sum invariance, inverse invariance, and addition and subtraction invariance.<ref>{{Cite web |last=Jamain |first=Adrien |date=September 2001 |title=Benford's Law |url=https://wwwf.imperial.ac.uk/~nadams/classificationgroup/Benfords-Law.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://wwwf.imperial.ac.uk/~nadams/classificationgroup/Benfords-Law.pdf |archive-date=2022-10-09 |url-status=live |access-date=2020-11-15 |website=Imperial College London}}</ref><ref>{{Cite journal |last=Berger |first=Arno |date=June 2011 |title=A basic theory of Benford's Law |url=https://projecteuclid.org/download/pdfview_1/euclid.ps/1311860830 |journal=Probability Surveys |volume=8 |pages=1–126}}</ref>
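The feet-and-yards argument can be checked both symbolically and empirically. The sketch below is a minimal, hypothetical illustration (it assumes [[NumPy]], and the synthetic "lengths" are generated from an arbitrary wide log-normal distribution purely for demonstration): it verifies that <math>\log_{10} 2 = \log_{10}(6/3)</math>, so a Benford-distributed first digit of 1 in yards corresponds to first digits 3, 4, or 5 in feet, and that rescaling a broadly distributed data set by a factor of 3 leaves the observed leading-digit frequencies essentially unchanged.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)

def digit_freqs(x):
    """Observed frequencies of the leading significant digits 1..9."""
    d = (10 ** (np.log10(x) % 1)).astype(int)
    return np.bincount(d, minlength=10)[1:10] / len(x)

# Symbolic check of the yards/feet argument:
# P(first digit in yards is 1) = log10(2), and the corresponding lengths in feet
# run from 3 up to (but not including) 6, so P(first digit in feet is 3, 4, or 5) = log10(6/3).
assert np.isclose(np.log10(2), np.log10(6) - np.log10(3))

# Empirical check of scale invariance on synthetic "lengths" spanning many orders
# of magnitude (hypothetical data, generated only for this demonstration).
lengths_ft = 10 ** rng.normal(0, 3, 100_000)
lengths_yd = lengths_ft / 3                    # the same lengths expressed in yards

print(np.round(digit_freqs(lengths_ft), 3))
print(np.round(digit_freqs(lengths_yd), 3))    # nearly identical frequencies
</syntaxhighlight>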