
===Graphical methods for identification===

Although more sophisticated and robust methods have been proposed, the graphical methods most frequently used to identify power-law probability distributions from random samples are Pareto quantile-quantile plots (or Pareto [[Q–Q plot]]s),{{citation needed|date=May 2012}} mean residual life plots<ref>Beirlant, J., Teugels, J. L., Vynckier, P. (1996) ''Practical Analysis of Extreme Values'', Leuven: Leuven University Press</ref><ref>Coles, S. (2001) ''An introduction to statistical modeling of extreme values''. Springer-Verlag, London.</ref> and [[log–log plot]]s. Another, more robust graphical method uses bundles of residual quantile functions.<ref name=Diaz>{{cite journal | last1 = Diaz |first1=F. J. | year = 1999 | title = Identifying Tail Behavior by Means of Residual Quantile Functions | journal = Journal of Computational and Graphical Statistics | volume = 8 | issue = 3| pages = 493–509 | doi = 10.2307/1390871 |jstor=1390871 }}</ref> Power-law distributions are also called Pareto-type distributions. It is assumed here that a random sample is obtained from a probability distribution, and that the goal is to determine whether the tail of the distribution follows a power law (in other words, whether the distribution has a "Pareto tail"). Here, the random sample is called "the data".

==== Pareto Q–Q plots ====
Pareto Q–Q plots compare the [[quantile]]s of the log-transformed data to the corresponding quantiles of an exponential distribution with mean 1 (or to the quantiles of a standard Pareto distribution) by plotting the former versus the latter. If the resulting scatterplot suggests that the plotted points ''asymptotically converge'' to a straight line, then a power-law distribution should be suspected. A limitation of Pareto Q–Q plots is that they behave poorly when the tail index <math>\alpha</math> (also called the Pareto index) is close to 0, because Pareto Q–Q plots are not designed to identify distributions with slowly varying tails.<ref name="Diaz" />
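
The construction can be sketched in Python with NumPy as follows. This is only an illustrative sketch: the function name <code>pareto_qq_points</code>, the plotting positions ''i''/(''n''&nbsp;+&nbsp;1) and the simulated sample are assumptions made for the example, not part of the cited methods. For a Pareto sample with tail index <math>\alpha</math>, the logarithm of the data is exponentially distributed with rate <math>\alpha</math>, so the points should lie near a line of slope <math>1/\alpha</math>.

```python
import numpy as np

def pareto_qq_points(data):
    """Return (Exp(1) quantiles, sorted log-data) for a Pareto Q-Q plot."""
    x = np.sort(np.asarray(data, dtype=float))
    if np.any(x <= 0):
        raise ValueError("data must be strictly positive")
    n = x.size
    # Plotting positions i/(n+1), i = 1..n (one common convention; an assumption here)
    p = np.arange(1, n + 1) / (n + 1)
    exp_q = -np.log(1.0 - p)  # quantiles of the exponential distribution with mean 1
    return exp_q, np.log(x)

# Hypothetical data: Pareto sample with minimum 1 and tail index alpha,
# generated by inverse-transform sampling.
rng = np.random.default_rng(0)
alpha = 2.0
sample = (1.0 - rng.random(5000)) ** (-1.0 / alpha)

exp_q, log_x = pareto_qq_points(sample)
# A least-squares line through the points; its slope should be near 1/alpha.
qq_slope = np.polyfit(exp_q, log_x, 1)[0]
```

Plotting <code>log_x</code> against <code>exp_q</code> (for example with a scatter plot) gives the Q–Q plot itself; the fitted slope is used here only as a numerical check of linearity.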

==== Mean residual life plots ====
In its version for identifying power-law probability distributions, the mean residual life plot consists of first log-transforming the data, and then plotting the average amount by which the log-transformed data above the ''i''-th order statistic exceed it, versus the ''i''-th order statistic, for ''i''&nbsp;=&nbsp;1,&nbsp;...,&nbsp;''n'', where ''n'' is the size of the random sample. If the resulting scatterplot suggests that the plotted points tend to stabilize about a horizontal straight line, then a power-law distribution should be suspected. Since the mean residual life plot is very sensitive to outliers (it is not robust), it usually produces plots that are difficult to interpret; for this reason, such plots are usually called Hill horror plots.<ref>{{cite journal | last1 = Resnick | first1 = S. I. | year = 1997 | title = Heavy Tail Modeling and Teletraffic Data | journal = The Annals of Statistics | volume = 25 | issue = 5| pages = 1805–1869 | doi=10.1214/aos/1069362376| doi-access = free }}</ref>
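
A minimal Python sketch of this plot is given below. It reads the plotted average as the mean excess of the log-data over each threshold, which is the usual definition of mean residual life; the function name <code>log_mean_excess</code> and the simulated sample are assumptions for illustration, and ties in the data are ignored. For a Pareto sample with tail index <math>\alpha</math>, the mean excess of the log-data is <math>1/\alpha</math> at every threshold, which is why a horizontal line is expected.

```python
import numpy as np

def log_mean_excess(data):
    """Return (thresholds, mean excess of the log-data above each threshold)."""
    y = np.sort(np.log(np.asarray(data, dtype=float)))
    n = y.size
    # suffix_sums[i] is the sum of y[i:], computed with a reversed cumulative sum
    suffix_sums = np.cumsum(y[::-1])[::-1]
    thresholds = y[:-1]               # the largest point has no exceedances
    counts = n - np.arange(1, n)      # number of points above y[i], i = 0..n-2
    mean_excess = suffix_sums[1:] / counts - thresholds
    return thresholds, mean_excess

# Hypothetical data: Pareto sample with minimum 1 and tail index alpha
rng = np.random.default_rng(1)
alpha = 2.0
sample = (1.0 - rng.random(5000)) ** (-1.0 / alpha)

thresholds, mean_excess = log_mean_excess(sample)
# Away from the extreme tail, the values should hover around 1/alpha.
typical_level = np.median(mean_excess[:2500])
```

The rightmost points of the plot are averages over very few observations, which illustrates the instability described above.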

==== Log–log plots ====
[[File:Log-log plot example.svg|thumb|A straight line on a log–log plot is necessary but insufficient evidence for a power law. The slope of the straight line corresponds to the power-law exponent.]]

[[Log–log plot]]s are an alternative way of graphically examining the tail of a distribution using a random sample. Taking the logarithm of a power law of the form <math>f(x) = ax^{k}</math> results in:<ref>http://www.physics.pomona.edu/sixideas/old/labs/LRM/LR05.pdf</ref>

:<math>\begin{align}
 \log(f(x)) &= \log(ax^{k}) \\
 &= \log(a) + \log(x^k) \\
 &= \log(a) + k \cdot \log(x),
\end{align}</math>

which forms a straight line with slope <math>k</math> on a log–log scale. Caution has to be exercised, however: a straight line on a log–log plot is necessary but insufficient evidence for a power-law relationship, as many non-power-law distributions can also appear as straight lines on a log–log plot.{{sfn|Clauset|Shalizi|Newman|2009}}<ref>{{cite web|url=http://bactra.org/weblog/491.html|title=So You Think You Have a Power Law — Well Isn't That Special?|website=bactra.org|access-date=27 March 2018}}</ref> The method consists of plotting the logarithm of an estimator of the probability that a particular value occurs versus the logarithm of that value. Usually, this estimator is the proportion of times that the value occurs in the data set. If the points in the plot tend to converge to a straight line for large values on the x axis, then the researcher concludes that the distribution has a power-law tail. Examples of the application of these types of plot have been published.<ref>{{cite journal |last1=Jeong |first1=H. |last2=Tombor |first2=B. |last3=Albert |first3=R. |last4=Oltvai |first4=Z. N. |last5=Barabási |first5=A.-L. |year=2000 |title=The large-scale organization of metabolic networks |journal=Nature |volume=407 |issue=6804| pages=651–654 |doi=10.1038/35036627 |pmid=11034217 |arxiv=cond-mat/0010278 |bibcode=2000Natur.407..651J |s2cid=4426931}}</ref> A disadvantage of these plots is that, in order to provide reliable results, they require very large amounts of data. In addition, they are appropriate only for discrete (or grouped) data.
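
The following Python sketch shows the procedure for discrete data. The function name <code>value_frequencies</code>, the simulated Zipf sample and the choice to fit only the values 1 to 10 are assumptions made for illustration. The least-squares line is used here only to read off the slope of the plotted points; it is not offered as a recommended estimator of the exponent.

```python
import numpy as np

def value_frequencies(values):
    """Return each distinct value and the proportion of times it occurs."""
    vals, counts = np.unique(np.asarray(values), return_counts=True)
    return vals, counts / counts.sum()

# Hypothetical data: discrete power law with P(k) proportional to k**(-k_true)
rng = np.random.default_rng(2)
k_true = 2.5
data = rng.zipf(k_true, size=100_000)

vals, freq = value_frequencies(data)
# Coordinates of the log-log plot
log_vals, log_freq = np.log10(vals), np.log10(freq)

# Fit a line where counts are large (values 1..10); slope should be near -k_true
head = vals <= 10
loglog_slope = np.polyfit(log_vals[head], log_freq[head], 1)[0]
```

For large values the counts become very small and the estimated proportions very noisy, which is one way to see why these plots need large amounts of data.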

==== Bundle plots ====
Another graphical method for the identification of power-law probability distributions using random samples has been proposed.<ref name="Diaz" /> This methodology consists of plotting a ''bundle for the log-transformed sample''. Originally proposed as a tool to explore the existence of moments and the moment-generating function using random samples, the bundle methodology is based on residual [[quantile function]]s (RQFs), also called residual percentile functions,<ref>{{cite journal | last1 = Arnold | first1 = B. C. | last2 = Brockett | first2 = P. L. | year = 1983 | title = When does the βth percentile residual life function determine the distribution? | journal = Operations Research | volume = 31 | issue = 2| pages = 391–396 | doi=10.1287/opre.31.2.391| doi-access =  }}</ref><ref>{{cite journal | last1 = Joe | first1 = H. | last2 = Proschan | first2 = F. | year = 1984 | title = Percentile residual life functions | journal = Operations Research | volume = 32 | issue = 3| pages = 668–678 | doi=10.1287/opre.32.3.668}}</ref><ref>Joe, H. (1985), "Characterizations of life distributions from percentile residual lifetimes", ''Ann. Inst. Statist. Math.'' 37, Part A, 165–172.</ref><ref>{{cite journal | last1 = Csorgo | first1 = S. | last2 = Viharos | first2 = L. | year = 1992 | title = Confidence bands for percentile residual lifetimes | url =https://deepblue.lib.umich.edu/bitstream/2027.42/30190/1/0000575.pdf | journal = Journal of Statistical Planning and Inference | volume = 30 | issue = 3| pages = 327–337 | doi=10.1016/0378-3758(92)90159-p| hdl = 2027.42/30190 | hdl-access = free }}</ref><ref>{{cite journal | last1 = Schmittlein | first1 = D. C. | last2 = Morrison | first2 = D. G. | year = 1981 | title = The median residual lifetime: A characterization theorem and an application | journal = Operations Research | volume = 29 | issue = 2| pages = 392–399 | doi=10.1287/opre.29.2.392}}</ref><ref>{{cite journal | last1 = Morrison | first1 = D. G. | last2 = Schmittlein | first2 = D. C. | year = 1980 | title = Jobs, strikes, and wars: Probability models for duration | journal = Organizational Behavior and Human Performance | volume = 25 | issue = 2| pages = 224–251 | doi=10.1016/0030-5073(80)90065-3}}</ref><ref>{{cite journal | last1 = Gerchak | first1 = Y | year = 1984 | title = Decreasing failure rates and related issues in the social sciences | journal = Operations Research | volume = 32 | issue = 3| pages = 537–546 | doi=10.1287/opre.32.3.537}}</ref> which provide a full characterization of the tail behavior of many well-known probability distributions, including power-law distributions, distributions with other types of heavy tails, and even non-heavy-tailed distributions. Bundle plots do not have the disadvantages of Pareto Q–Q plots, mean residual life plots and log–log plots mentioned above (they are robust to outliers, allow visually identifying power laws with small values of <math>\alpha</math>, and do not demand the collection of much data).{{citation needed|date=May 2012}} In addition, other types of tail behavior can be identified using bundle plots.
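
The sketch below does not implement the bundle methodology itself; it only illustrates the building block it is based on, an empirical residual quantile function. It assumes the standard definition of the percentile residual life at level ''p'' and threshold ''t'', namely <math>F^{-1}\bigl(1 - (1-p)(1-F(t))\bigr) - t</math>, with <math>F</math> replaced by the empirical distribution function. The function name <code>empirical_rqf</code>, the simulated sample and the chosen thresholds are hypothetical. For a Pareto sample with tail index <math>\alpha</math>, the log-data are exponential with rate <math>\alpha</math>, so this function should be roughly constant at <math>-\ln(1-p)/\alpha</math>.

```python
import numpy as np

def empirical_rqf(data, p, thresholds):
    """Empirical p-th residual quantile of `data` at each threshold t."""
    y = np.sort(np.asarray(data, dtype=float))
    n = y.size
    out = []
    for t in thresholds:
        # Empirical distribution function at t: fraction of points <= t
        F_t = np.searchsorted(y, t, side="right") / n
        # Quantile level of the p-th percentile among the points above t
        level = 1.0 - (1.0 - p) * (1.0 - F_t)
        out.append(np.quantile(y, level) - t)
    return np.array(out)

# Hypothetical data: logarithm of a Pareto sample with tail index alpha
rng = np.random.default_rng(3)
alpha = 2.0
log_sample = np.log((1.0 - rng.random(5000)) ** (-1.0 / alpha))

# Thresholds placed at the 10%, 50% and 80% sample quantiles (an arbitrary choice)
ts = np.quantile(log_sample, [0.1, 0.5, 0.8])
median_residual = empirical_rqf(log_sample, 0.5, ts)
expected_level = -np.log(1.0 - 0.5) / alpha
```

Repeating the computation for several values of <code>p</code> gives a family of curves over the thresholds; how such curves are combined and read as a bundle is described in the cited paper and is not reproduced here.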