
==Occurrence and applications==
===General===
[[Vilfredo Pareto]] originally used this distribution to describe the [[Distribution of wealth|allocation of wealth]] among individuals, because it appeared to capture how a large share of a society's wealth is owned by a small percentage of its people. He also used it to describe the distribution of income.<ref name=":1">Pareto, Vilfredo, ''Cours d'Économie Politique: Nouvelle édition par G.-H. Bousquet et G. Busino'', Librairie Droz, Geneva, 1964, pp. 299–345. [https://web.archive.org/web/20130531151249/http://www.institutcoppet.org/wp-content/uploads/2012/05/Cours-d%C3%A9conomie-politique-Tome-II-Vilfredo-Pareto.pdf Original book archived]</ref> This idea is sometimes expressed more simply as the [[Pareto principle]] or the "80-20 rule", which says that 20% of the population controls 80% of the wealth.<ref>For a two-quantile population, where approximately 18% of the population owns 82% of the wealth, the [[Theil index]] takes the value 1.</ref> As Michael Hudson points out (''The Collapse of Antiquity'' [2023] p. 85 & n.7), "a mathematical corollary [is] that 10% would have 65% of the wealth, and 5% would have half the national wealth." However, the 80-20 rule corresponds to one particular value of ''α'', and in fact Pareto's data on British income taxes in his ''Cours d'économie politique'' indicate that about 30% of the population had about 70% of the income.{{citation needed|date=May 2019}} The [[probability density function]] (PDF) graph at the beginning of this article shows that the "probability", or fraction of the population, that owns a small amount of wealth per person is rather high, and then decreases steadily as wealth increases. (The Pareto distribution is not realistic at the lower end of the wealth scale, however; [[net worth]] may even be negative.)
This distribution is not limited to describing wealth or income; it applies to many situations in which an equilibrium is found in the distribution of the "small" to the "large". The following examples are sometimes seen as approximately Pareto-distributed:
<!-- THESE TWO SEEM TO BELONG UNDER [[Zipf's law]] RATHER THAN THE PARETO DISTRIBUTION
* Frequencies of words in longer texts (a few words are used often, lots of words are used infrequently)
* Frequencies of [[Given name#Popularity distribution of given names|given names]] -->
* All four variables of the household's budget constraint: consumption, labor income, capital income, and wealth.<ref>{{cite web |ssrn=4636704 |last1=Gaillard |first1=Alexandre |last2=Hellwig |first2=Christian |last3=Wangner | first3=Philipp |last4=Werquin |first4=Nicolas |title=Consumption, Wealth, and Income Inequality: A Tale of Tails |date=2023  |doi=10.2139/ssrn.4636704 | url=https://ssrn.com/abstract=4636704 }}</ref>
* The sizes of human settlements (few cities, many hamlets/villages)<ref name="Reed">{{cite journal |citeseerx=10.1.1.70.4555 |first=William J. |last=Reed |title=The Double Pareto-Lognormal Distribution – A New Parametric Model for Size Distributions |journal=Communications in Statistics – Theory and Methods |volume=33 |issue=8 |pages=1733–53 |year=2004 |doi=10.1081/sta-120037438|s2cid=13906086 |display-authors=etal}}</ref><ref name="Reed2002">{{cite journal |first=William J. |last=Reed |title=On the rank-size distribution for human settlements |journal=Journal of Regional Science |volume=42 |issue=1 |pages=1–17 |year=2002 |doi=10.1111/1467-9787.00247|bibcode=2002JRegS..42....1R |s2cid=154285730 }}</ref>
* File size distribution of Internet traffic which uses the TCP protocol (many smaller files, few larger ones)<ref name ="Reed" />
* [[Hard disk drive]] error rates<ref>{{cite journal |title=Understanding latent sector error and how to protect against them |url=http://www.usenix.org/event/fast10/tech/full_papers/schroeder.pdf |first1=Bianca |last1=Schroeder |author1-link= Bianca Schroeder |first2=Sotirios |last2=Damouras |first3=Phillipa |last3=Gill |journal=8th Usenix Conference on File and Storage Technologies (FAST 2010)| date=2010-02-24 |access-date=2010-09-10 |quote=We experimented with 5 different distributions (Geometric, Weibull, Rayleigh, Pareto, and Lognormal), that are commonly used in the context of system reliability, and evaluated their fit through the total squared differences between the actual and hypothesized frequencies (χ<sup>2</sup> statistic). We found consistently across all models that the geometric distribution is a poor fit, while the Pareto distribution provides the best fit.}}</ref>
* Clusters of [[Bose–Einstein condensate]] near [[absolute zero]]<ref name="Simon">{{cite journal|first2=Herbert A.|last2=Simon|author=Yuji Ijiri |title=Some Distributions Associated with Bose–Einstein Statistics|journal=Proc. Natl. Acad. Sci. USA|date=May 1975|volume=72|issue=5|pages=1654–57|pmc=432601|pmid=16578724|doi=10.1073/pnas.72.5.1654|bibcode=1975PNAS...72.1654I|doi-access=free}}</ref>
[[File:FitParetoDistr.tif|thumb|250px|Fitted cumulative Pareto (Lomax) distribution to maximum one-day rainfalls using [[CumFreq]], see also [[distribution fitting]] ]]
* The values of [[oil reserves]] in oil fields (a few [[Giant oil and gas fields|large fields]], many [[Stripper well|small fields]])<ref name ="Reed" />
* The length distribution in jobs assigned to supercomputers (a few large ones, many small ones)<ref>{{Cite journal|last1=Harchol-Balter|first1=Mor|author1-link=Mor Harchol-Balter|last2=Downey|first2=Allen|date=August 1997|title=Exploiting Process Lifetime Distributions for Dynamic Load Balancing|url=https://users.soe.ucsc.edu/~scott/courses/Fall11/221/Papers/Sync/harcholbalter-tocs97.pdf|journal=ACM Transactions on Computer Systems|volume=15|issue=3|pages=253–258|doi=10.1145/263326.263344|s2cid=52861447}}</ref>
* The standardized price returns on individual stocks<ref name="Reed" />
* Sizes of sand particles<ref name="Reed" />
* The size of meteorites
* Severity of large [[casualty (person)|casualty]] losses for certain lines of business such as general liability, commercial auto, and workers compensation.<ref>Kleiber and Kotz (2003): p. 94.</ref><ref>{{cite journal |last1=Seal |first1=H. |year=1980 |title=Survival probabilities based on Pareto claim distributions |journal=ASTIN Bulletin |volume=11 |pages=61–71|doi=10.1017/S0515036100006620 |doi-access=free }}</ref>
* Amount of time a user on [[Steam (service)|Steam]] will spend playing different games. (A few games are played a lot, but most are played almost never.) [https://docs.google.com/spreadsheets/d/1AjtfgTQc1T84NCyJWGcCPN4jrVsOpX0bp0jgPZJEW6A/edit#gid=0]{{Original research inline|date=December 2020}}
* In [[hydrology]] the Pareto distribution is applied to extreme events such as annual maximum one-day rainfalls and river discharges.<ref>CumFreq, software for cumulative frequency analysis and probability distribution fitting [https://www.waterlog.info/cumfreq.htm]</ref> The blue picture illustrates an example of fitting the Pareto distribution to ranked annual maximum one-day rainfalls, showing also the 90% [[confidence belt]] based on the [[binomial distribution]]. The rainfall data are represented by [[plotting position]]s as part of the [[cumulative frequency analysis]].
* In electric utility distribution reliability, where about 80% of the customer minutes interrupted occur on approximately 20% of the days in a given year.

===Relation to Zipf's law===
The Pareto distribution is a continuous probability distribution. [[Zipf's law]], also sometimes called the [[zeta distribution]], is a discrete distribution, separating the values into a simple ranking. Both are simple power laws with a negative exponent, scaled so that the total probability equals 1. Zipf's law can be derived from the Pareto distribution if the <math>x</math> values (incomes) are binned into <math>N</math> ranks so that the number of people in each bin follows a 1/rank pattern. The distribution is normalized by defining <math>x_\mathrm{m}</math> so that <math>\alpha x_\mathrm{m}^\alpha = \frac{1}{H(N,\alpha+1)}</math> where <math>H(N,\alpha+1)</math> is the [[Harmonic number#Generalized harmonic numbers|generalized harmonic number]]. This makes Zipf's probability mass function derivable from Pareto's probability density function.

: <math>f(x) = \frac{\alpha x_\mathrm{m}^\alpha}{x^{\alpha+1}} = \frac{1}{x^s H(N,s)}</math>

where <math>s = \alpha+1</math>, so that the powers of <math>x</math> on both sides agree, and <math>x</math> is an integer representing rank from 1 to <math>N</math>, where <math>N</math> is the highest income bracket. So a randomly selected person (or word, website link, or city) from a population (or language, internet, or country) has <math>f(x)</math> probability of ranking <math>x</math>.
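
As a minimal sketch of the normalization by the generalized harmonic number, the following Python snippet evaluates the finite Zipf probabilities <math>1/(x^s H(N,s))</math> directly and checks that they sum to 1. The function names and the values ''N'' = 1000 and ''s'' = 1 are illustrative assumptions and are not taken from a cited source.

```python
def generalized_harmonic(n, s):
    """Generalized harmonic number H(n, s) = sum over k = 1..n of 1 / k**s."""
    return sum(1.0 / k**s for k in range(1, n + 1))


def zipf_probabilities(n, s):
    """Return [f(1), ..., f(n)] with f(x) = 1 / (x**s * H(n, s))."""
    h = generalized_harmonic(n, s)  # computed once, shared by all ranks
    return [1.0 / (x**s * h) for x in range(1, n + 1)]


# Illustrative values (assumptions): 1000 ranks, exponent s = 1.
probs = zipf_probabilities(1000, 1.0)
print("sum of probabilities:", sum(probs))
print("f(1) / f(2):", probs[0] / probs[1])
```

With ''s'' = 1 the ratio of the first two probabilities is 2, which is the 1/rank pattern described above.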

===Relation to the "Pareto principle"===
The "[[Pareto principle|80/20 law]]", according to which 20% of all people receive 80% of all income, and 20% of the most affluent 20% receive 80% of that 80%, and so on, holds precisely when the Pareto index is <math>\alpha = \log_4 5 = \cfrac{\log_{10} 5}{\log_{10} 4} \approx 1.161</math>. This result can be derived from the [[Lorenz curve]] formula given below. Moreover, the following have been shown<ref>{{cite journal |last1=Hardy |first1=Michael |year=2010 |title=Pareto's Law |journal=[[Mathematical Intelligencer]] |volume=32 |issue=3 |pages=38–43 |doi=10.1007/s00283-010-9159-2|s2cid=121797873 }}</ref> to be mathematically equivalent:
* Income is distributed according to a Pareto distribution with index ''α''&nbsp;>&nbsp;1.
* There is some number 0&nbsp;≤&nbsp;''p''&nbsp;≤&nbsp;1/2 such that 100''p''&nbsp;% of all people receive 100(1&nbsp;−&nbsp;''p'')&nbsp;% of all income, and similarly for every real (not necessarily integer) ''n''&nbsp;>&nbsp;0, 100''p<sup>n</sup>''&nbsp;% of all people receive 100(1&nbsp;−&nbsp;''p'')<sup>''n''</sup>&nbsp;% of all income. ''α'' and ''p'' are related by
:: <math>1-\frac{1}{\alpha}=\frac{\ln(1-p)}{\ln(p)}=\frac{\ln((1-p)^n)}{\ln(p^n)}</math>

This does not apply only to income, but also to wealth, or to anything else that can be modeled by this distribution.

This excludes Pareto distributions in which&nbsp;0&nbsp;<&nbsp;''α''&nbsp;≤&nbsp;1, which, as noted above, have an infinite expected value, and so cannot reasonably model income distribution.
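
The relation between ''α'' and ''p'' given above can be explored numerically. The sketch below is only an illustration: the function names are hypothetical, the inverse is found by simple bisection (a choice made here, not taken from the cited source), and the endpoint cases ''p'' = 0 and ''p'' = 1/2 are excluded because the logarithms or the resulting index are not finite there.

```python
import math


def alpha_from_p(p):
    """Pareto index for which 100p % of people receive 100(1 - p) % of income."""
    if not 0.0 < p < 0.5:
        raise ValueError("p must lie strictly between 0 and 1/2")
    k = math.log1p(-p) / math.log(p)  # k = 1 - 1/alpha
    return 1.0 / (1.0 - k)


def p_from_alpha(alpha, iterations=200):
    """Invert alpha_from_p by bisection; requires alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 for a finite mean")
    target = 1.0 - 1.0 / alpha
    lo, hi = 1e-12, 0.5  # ln(1-p)/ln(p) increases from about 0 to 1 on this interval
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if math.log1p(-mid) / math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(alpha_from_p(0.2))  # the 80/20 case, equal to log(5)/log(4)
print(p_from_alpha(2.0))  # the split implied by alpha = 2
```

For ''p'' = 0.2 the first function returns <math>\log_4 5</math>, in agreement with the value stated at the start of this subsection.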

===Relation to Price's law===
[[Price's law]] is sometimes offered as a property of or as similar to the Pareto distribution. However, the law only holds in the case that <math>\alpha=1</math>. Note that in this case, the total and expected amount of wealth are not defined, and the rule only applies asymptotically to random samples. The extended Pareto Principle mentioned above is a far more general rule.

===Lorenz curve and Gini coefficient===
[[File:ParetoLorenzSVG.svg|thumb|325px|Lorenz curves for a number of Pareto distributions. The case ''α''&nbsp;=&nbsp;∞ corresponds to perfectly equal distribution (''G''&nbsp;=&nbsp;0) and the ''α''&nbsp;=&nbsp;1 line corresponds to complete inequality (''G''&nbsp;=&nbsp;1)]]

The [[Lorenz curve]] is often used to characterize income and wealth distributions. For any distribution, the Lorenz curve ''L''(''F'') is written in terms of the PDF ''f'' or the CDF ''F'' as

:<math>L(F)=\frac{\int_{x_\mathrm{m}}^{x(F)}xf(x)\,dx}{\int_{x_\mathrm{m}}^\infty xf(x)\,dx} =\frac{\int_0^F x(F')\,dF'}{\int_0^1 x(F')\,dF'}</math>

where ''x''(''F'') is the inverse of the CDF. For the Pareto distribution,

:<math>x(F)=\frac{x_\mathrm{m}}{(1-F)^{\frac{1}{\alpha}}}</math>

and the Lorenz curve is calculated to be

:<math>L(F) = 1-(1-F)^{1-\frac{1}{\alpha}}.</math>

For <math>0<\alpha\le 1</math> the denominator is infinite, yielding ''L''=0. Examples of the Lorenz curve for a number of Pareto distributions are shown in the graph on the right.
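
The closed form for ''L''(''F'') can be obtained, for <math>\alpha > 1</math>, by integrating the inverse CDF given above. The following intermediate steps are a worked sketch that uses only the quantities already defined in this section:

:<math>\int_0^F x(F')\,dF' = x_\mathrm{m}\int_0^F (1-F')^{-\frac{1}{\alpha}}\,dF' = \frac{\alpha x_\mathrm{m}}{\alpha-1}\left[1-(1-F)^{1-\frac{1}{\alpha}}\right]</math>

Setting <math>F = 1</math> gives the denominator <math>\frac{\alpha x_\mathrm{m}}{\alpha-1}</math>, which is the mean of the distribution, and the ratio of the two integrals is

:<math>L(F) = \frac{\frac{\alpha x_\mathrm{m}}{\alpha-1}\left[1-(1-F)^{1-\frac{1}{\alpha}}\right]}{\frac{\alpha x_\mathrm{m}}{\alpha-1}} = 1-(1-F)^{1-\frac{1}{\alpha}}</math>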

According to [[Oxfam]] (2016) the richest 62 people have as much wealth as the poorest half of the world's population.<ref>{{cite web|title=62 people own the same as half the world, reveals Oxfam Davos report|url=https://www.oxfam.org/en/pressroom/pressreleases/2016-01-18/62-people-own-same-half-world-reveals-oxfam-davos-report|publisher=Oxfam|date=Jan 2016}}</ref> We can estimate the Pareto index that would apply to this situation. Letting ε equal <math>62/(7\times 10^9)</math> we have:
:<math>L(1/2)=1-L(1-\varepsilon)</math>
or
:<math>1-(1/2)^{1-\frac{1}{\alpha}}=\varepsilon^{1-\frac{1}{\alpha}}</math>
<!--:<math>\ln(1-(1/2)^{1-\frac{1}{\alpha}})=(1-\frac{1}{\alpha})\ln\varepsilon</math>
:<math>\ln(1-(1/2)^{1-\frac{1}{\alpha}})=(\ln\varepsilon/\ln 2)(1-\frac{1}{\alpha})\ln 2</math>
:<math>\ln(1-(1/2)^{1-\frac{1}{\alpha}})=-(\ln\varepsilon/\ln 2)\ln((1/2)^{1-\frac{1}{\alpha}})</math>
:<math>\ln(1-(1/2)^{1-\frac{1}{\alpha}})\approx(\ln\varepsilon/\ln 2)(1-(1/2)^{1-\frac{1}{\alpha}})</math>
:<math>-\ln(1-(1/2)^{1-\frac{1}{\alpha}})\exp(-\ln(1-(1/2)^{1-\frac{1}{\alpha}}))\approx -\ln\varepsilon/\ln 2</math>
:<math>-\ln(1-(1/2)^{1-\frac{1}{\alpha}})\approx W(-\ln\varepsilon/\ln 2)</math>
where ''W'' is the [[Lambert W function]]. So
:<math>(1/2)^{1-\frac{1}{\alpha}}\approx 1-\exp(-W(-\ln\varepsilon/\ln 2))</math>
:<math>{1-\frac{1}{\alpha}}\approx -\ln(1-\exp(-W(-\ln\varepsilon/\ln 2)))/\ln 2</math>
:<math>\alpha\approx 1/(1+\ln(1-\exp(-W(-\ln\varepsilon/\ln 2)))/\ln 2)</math>
-->The solution is that ''α'' equals about 1.15, and about 9% of the wealth is owned by each of the two groups. But actually the poorest 69% of the world adult population owns only about 3% of the wealth.<ref>{{cite web|title=Global Wealth Report 2013|url=https://publications.credit-suisse.com/tasks/render/file/?fileID=BCDB1364-A105-0560-1332EC9100FF5C83|publisher=Credit Suisse|page=22|date=Oct 2013|access-date=2016-01-24|archive-url=https://web.archive.org/web/20150214155424/https://publications.credit-suisse.com/tasks/render/file/?fileID=BCDB1364-A105-0560-1332EC9100FF5C83|archive-date=2015-02-14|url-status=dead}}</ref>
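
The equation for ''α'' in this example has no elementary closed-form solution, but it can be solved numerically. The sketch below substitutes <math>k = 1 - 1/\alpha</math> and applies bisection to <math>1 - (1/2)^k - \varepsilon^k = 0</math> on the interval from 0 to 1. The function name and the choice of bisection are assumptions made here for illustration, and the world population of <math>7\times 10^9</math> is the figure used in the text above.

```python
def solve_pareto_index(eps, iterations=200):
    """Solve 1 - (1/2)**k = eps**k for k = 1 - 1/alpha; return (alpha, share)."""

    def h(k):
        return 1.0 - 0.5**k - eps**k

    lo, hi = 0.0, 1.0  # h(0) = -1 < 0 and h(1) = 1/2 - eps > 0, and h is increasing
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    alpha = 1.0 / (1.0 - k)
    share = 1.0 - 0.5**k  # L(1/2): wealth share of the poorest half
    return alpha, share


alpha, share = solve_pareto_index(62 / 7e9)
print("alpha:", round(alpha, 3))
print("share held by each group:", round(share, 3))
```

With <math>\varepsilon = 62/(7\times 10^9)</math> this reproduces the figures quoted above: ''α'' of about 1.15 and a share of about 9% for each group.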

The [[Gini coefficient]] is a measure of the deviation of the Lorenz curve from the equidistribution line, the line connecting [0,&nbsp;0] and [1,&nbsp;1], which is shown in black (''α''&nbsp;=&nbsp;∞) in the Lorenz plot on the right. Specifically, the Gini coefficient is twice the area between the Lorenz curve and the equidistribution line. The Gini coefficient for the Pareto distribution is then calculated (for <math>\alpha\ge 1</math>) to be

:<math>G = 1-2 \left (\int_0^1L(F) \, dF \right ) = \frac{1}{2\alpha-1}</math>

(see Aaberge 2005).
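
The closed form for ''G'' can be checked numerically against its definition as <math>1 - 2\int_0^1 L(F)\,dF</math>. The sketch below uses the Lorenz curve formula from this section and a plain midpoint rule; the function names and the number of subintervals are illustrative assumptions, and the check is restricted to <math>\alpha > 1</math>.

```python
def lorenz(F, alpha):
    """Lorenz curve of the Pareto distribution, valid for alpha > 1."""
    return 1.0 - (1.0 - F) ** (1.0 - 1.0 / alpha)


def gini_numeric(alpha, n=200_000):
    """Approximate G = 1 - 2 * integral of L(F) over [0, 1] by the midpoint rule."""
    h = 1.0 / n
    area = sum(lorenz((i + 0.5) * h, alpha) for i in range(n)) * h
    return 1.0 - 2.0 * area


def gini_closed_form(alpha):
    """G = 1 / (2 * alpha - 1)."""
    return 1.0 / (2.0 * alpha - 1.0)


for a in (1.5, 2.0, 3.0):
    print(a, gini_numeric(a), gini_closed_form(a))
```

The two columns agree to several decimal places for the values of ''α'' tried here. The agreement becomes slower to reach as ''α'' approaches 1, because the Lorenz curve then rises very steeply near <math>F = 1</math>.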