== Probabilistic analysis ==

Given any random variables ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub>, the order statistics X<sub>(1)</sub>, X<sub>(2)</sub>, ..., X<sub>(''n'')</sub> are also random variables, defined by sorting the values ([[realization (probability)|realizations]]) of ''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub> in increasing order.

When the random variables ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub> form a [[sample (statistics)|sample]] they are [[independent and identically distributed]]. This is the case treated below. In general, the random variables ''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub> can arise by sampling from more than one population. Then they are [[independent (statistics)|independent]], but not necessarily identically distributed, and their [[joint probability distribution]] is given by the [[Bapat–Beg theorem]].

From now on, we will assume that the random variables under consideration are [[continuous probability distribution|continuous]] and, where convenient, we will also assume that they have a [[probability density function]] (PDF), that is, they are [[absolute continuity|absolutely continuous]]. The peculiarities of the analysis of distributions assigning mass to points (in particular, [[discrete distribution]]s) are discussed at the end.

=== Cumulative distribution function of order statistics ===

For a random sample as above, with cumulative distribution <math>F_X(x)</math>, the order statistics for that sample have cumulative distributions as follows<ref>{{cite book |last1=Casella |first1=George |last2=Berger |first2=Roger |title=Statistical Inference |year=2002 |publisher=Cengage Learning |isbn=9788131503942 |page=229 |edition=2nd |url={{Google books|0x_vAAAAMAAJ|page=228|plainurl=yes}} }}</ref> (where ''r'' specifies which order statistic):
<math display="block"> F_{X_{(r)}}(x) = \sum_{j=r}^{n} \binom{n}{j} [ F_{X}(x) ]^{j} [ 1 - F_{X}(x) ]^{n-j} </math>

The proof of this formula is pure [[combinatorics]]: for the <math>r</math>th order statistic to be <math> \leq x </math>, the number of samples that are <math> > x </math> has to be between <math> 0 </math> and <math> n-r </math>. In the case that <math> X_{(j)} </math> is the largest order statistic <math> \leq x </math>, there have to be <math> j </math> samples <math> \leq x </math> (each with an independent probability of <math> F_X(x) </math>) and <math> n-j </math> samples <math> >x </math> (each with an independent probability of <math> 1 - F_X(x) </math>). Finally, there are <math> \textstyle \binom{n}{j} </math> different ways of choosing which of the <math> n </math> samples are of the <math> \leq x </math> kind.

The corresponding probability density function may be derived from this result, and is found to be
:<math>f_{X_{(r)}}(x) = \frac{n!}{(r-1)!(n-r)!} f_{X}(x) [ F_{X}(x) ]^{r-1} [ 1 - F_{X}(x) ]^{n-r}.</math>

Moreover, there are two special cases, which have CDFs that are easy to compute.
:<math>F_{X_{(n)}}(x) = \operatorname{Prob}(\max\{\,X_1,\ldots,X_n\,\} \leq x) = [ F_{X}(x) ]^n</math>
:<math>F_{X_{(1)}}(x) = \operatorname{Prob}(\min\{\,X_1,\ldots,X_n\,\} \leq x) = 1- [ 1 - F_{X}(x) ]^n</math>
These follow directly by considering the events that all <math>n</math> samples are <math> \leq x </math> (for the maximum) or that all <math>n</math> samples are <math> > x </math> (for the minimum).
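The CDF formula above lends itself to a quick numerical check. The following is a minimal sketch (illustrative only; the standard normal parent distribution and the values of ''n'', ''r'', ''x'' are assumptions, not taken from this article):

<syntaxhighlight lang="python">
# Minimal numerical sketch: compare the closed-form CDF of the r-th order statistic
# with a Monte Carlo estimate. The parent distribution (standard normal) and the
# values of n, r, x are illustrative assumptions.
import numpy as np
from scipy.stats import norm, binom

rng = np.random.default_rng(0)
n, r, x = 5, 3, 0.4

# Closed form: P(X_(r) <= x) = sum_{j=r}^{n} C(n,j) F(x)^j (1-F(x))^(n-j)
F = norm.cdf(x)
cdf_formula = sum(binom.pmf(j, n, F) for j in range(r, n + 1))

# Monte Carlo: sort each simulated sample and take its r-th smallest value
samples = np.sort(rng.standard_normal((200_000, n)), axis=1)
cdf_mc = np.mean(samples[:, r - 1] <= x)

print(cdf_formula, cdf_mc)  # the two values should agree to roughly three decimals
</syntaxhighlight>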
=== Probability distributions of order statistics ===

==== Order statistics sampled from a uniform distribution ====

In this section we show that the order statistics of the [[uniform distribution (continuous)|uniform distribution]] on the [[unit interval]] have [[marginal distribution]]s belonging to the [[beta distribution]] family. We also give a simple method to derive the joint distribution of any number of order statistics, and finally translate these results to arbitrary continuous distributions using the [[cumulative distribution function|cdf]].

We assume throughout this section that <math>X_1, X_2, \ldots, X_n</math> is a [[random sample]] drawn from a continuous distribution with cdf <math>F_X</math>. Denoting <math>U_i=F_X(X_i)</math> we obtain the corresponding random sample <math>U_1,\ldots,U_n</math> from the standard [[uniform distribution (continuous)|uniform distribution]]. Note that the order statistics also satisfy <math>U_{(i)}=F_X(X_{(i)})</math>.

The probability density function of the order statistic <math>U_{(k)}</math> is equal to<ref name="gentle">{{citation|title=Computational Statistics|first=James E.|last=Gentle|publisher=Springer|year=2009|isbn=9780387981444|page=63|url=https://books.google.com/books?id=mQ5KAAAAQBAJ&pg=PA63}}.</ref>
:<math>f_{U_{(k)}}(u)={n!\over (k-1)!(n-k)!}u^{k-1}(1-u)^{n-k}</math>
that is, the ''k''th order statistic of the uniform distribution is a [[beta distribution|beta-distributed]] random variable.<ref name="gentle"/><ref>{{citation|title=Kumaraswamy's distribution: A beta-type distribution with some tractability advantages|first=M. C.|last=Jones|journal=Statistical Methodology|volume=6|issue=1|year=2009|pages=70–81|doi=10.1016/j.stamet.2008.04.001|quote=As is well known, the beta distribution is the distribution of the ''m'' ’th order statistic from a random sample of size ''n'' from the uniform distribution (on (0,1)).}}</ref>
:<math>U_{(k)} \sim \operatorname{Beta}(k,n+1-k).</math>

The proof of these statements is as follows. For <math>U_{(k)}</math> to be between ''u'' and ''u'' + ''du'', it is necessary that exactly ''k'' − 1 elements of the sample are smaller than ''u'', and that at least one is between ''u'' and ''u'' + d''u''. The probability that more than one is in this latter interval is already <math>O(du^2)</math>, so we have to calculate the probability that exactly ''k'' − 1, 1 and ''n'' − ''k'' observations fall in the intervals <math>(0,u)</math>, <math>(u,u+du)</math> and <math>(u+du,1)</math> respectively. This equals (refer to [[multinomial distribution]] for details)
:<math>{n!\over (k-1)!(n-k)!}u^{k-1}\cdot du\cdot(1-u-du)^{n-k}</math>
and the result follows. The mean of this distribution is ''k'' / (''n'' + 1).

==== The joint distribution of the order statistics of the uniform distribution ====

Similarly, for ''i'' < ''j'', the [[joint probability distribution|joint probability density function]] of the two order statistics ''U''<sub>(''i'')</sub> < ''U''<sub>(''j'')</sub> can be shown to be
:<math>f_{U_{(i)},U_{(j)}}(u,v) = n!{u^{i-1}\over (i-1)!}{(v-u)^{j-i-1}\over(j-i-1)!}{(1-v)^{n-j}\over (n-j)!}</math>
which is (up to terms of higher order than <math>O(du\,dv)</math>) the probability that ''i'' − 1, 1, ''j'' − 1 − ''i'', 1 and ''n'' − ''j'' sample elements fall in the intervals <math>(0,u)</math>, <math>(u,u+du)</math>, <math>(u+du,v)</math>, <math>(v,v+dv)</math>, <math>(v+dv,1)</math> respectively.

One reasons in an entirely analogous way to derive the higher-order joint distributions.
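As a quick sanity check of the beta marginal stated above, the following sketch (illustrative only; the values of ''n'' and ''k'' and the use of NumPy/SciPy are assumptions, not part of this article) simulates uniform order statistics and compares them with <math>\operatorname{Beta}(k, n+1-k)</math>:

<syntaxhighlight lang="python">
# Illustrative sketch: the k-th order statistic of n iid Uniform(0,1) variables
# should follow Beta(k, n+1-k), with mean k/(n+1).
import numpy as np
from scipy.stats import beta, kstest

rng = np.random.default_rng(1)
n, k = 10, 3                                  # arbitrary illustrative choices

# k-th smallest value in each of 100,000 simulated samples of size n
u_k = np.sort(rng.uniform(size=(100_000, n)), axis=1)[:, k - 1]

print(u_k.mean(), k / (n + 1))                # empirical mean vs. theoretical k/(n+1)
print(kstest(u_k, beta(k, n + 1 - k).cdf))    # KS test against the Beta(k, n+1-k) CDF
</syntaxhighlight>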
Perhaps surprisingly, the joint density of the ''n'' order statistics turns out to be ''constant'':
:<math>f_{U_{(1)},U_{(2)},\ldots,U_{(n)}}(u_{1},u_{2},\ldots,u_{n}) = n!.</math>
One way to understand this is that the unordered sample does have constant density equal to 1, and that there are ''n''! different permutations of the sample corresponding to the same sequence of order statistics. This is related to the fact that 1/''n''! is the volume of the region <math>0<u_1<\cdots<u_n<1</math>. It is also related to another particularity of order statistics of uniform random variables: it follows from the [[BRS-inequality]] that the expected maximum number of uniform U(0,1] random variables one can choose from a sample of size ''n'' whose sum does not exceed <math>0 < s < n/2</math> is bounded above by <math> \sqrt{2sn} </math>, which is thus invariant on the set of all <math> s, n </math> with constant product <math> s n </math>.

Using the above formulas, one can derive the distribution of the range of the order statistics, that is, the distribution of <math>U_{(n)}-U_{(1)}</math>, the maximum minus the minimum. More generally, for <math>n\geq k>j\geq 1</math>, <math>U_{(k)}-U_{(j)} </math> also has a beta distribution:
<math display="block">U_{(k)}-U_{(j)}\sim \operatorname{Beta}(k-j, n-(k-j)+1)</math>
From these formulas we can derive the covariance between two order statistics:
<math display="block">\operatorname{Cov}(U_{(k)},U_{(j)})=\frac{j(n-k+1)}{(n+1)^2(n+2)}</math>
The formula follows from noting that
<math display="block">\operatorname{Var}(U_{(k)}-U_{(j)})=\operatorname{Var}(U_{(k)}) + \operatorname{Var}(U_{(j)})-2\cdot \operatorname{Cov}(U_{(k)},U_{(j)}) =\frac{k(n-k+1)}{(n+1)^2(n+2)}+\frac{j(n-j+1)}{(n+1)^2(n+2)}-2\cdot \operatorname{Cov}(U_{(k)},U_{(j)})</math>
and comparing that with
<math display="block">\operatorname{Var}(U)=\frac{(k-j)(n-(k-j)+1)}{(n+1)^2(n+2)}</math>
where <math>U\sim \operatorname{Beta}(k-j,n-(k-j)+1)</math>, which is the actual distribution of the difference.

==== Order statistics sampled from an exponential distribution ====

For <math>X_1, X_2, \ldots, X_n</math> a random sample of size ''n'' from an [[exponential distribution]] with parameter ''λ'', the order statistics ''X''<sub>(''i'')</sub> for ''i'' = 1, 2, ..., ''n'' each have distribution
::<math>X_{(i)} \stackrel{d}{=} \frac{1}{\lambda}\left( \sum_{j=1}^i \frac{Z_j}{n-j+1} \right)</math>
where the ''Z''<sub>''j''</sub> are iid standard exponential random variables (i.e. with rate parameter 1). This result was first published by [[Alfréd Rényi]].<ref>{{Citation | last1 = David | first1 = H. A. | last2 = Nagaraja | first2 = H. N. | title = Order Statistics | pages = 9 | year = 2003 | chapter = Chapter 2. Basic Distribution Theory | doi = 10.1002/0471722162.ch2 | series = Wiley Series in Probability and Statistics | isbn = 9780471722168 }}</ref><ref>{{cite journal |last = Rényi |first = Alfréd | author-link = Alfréd Rényi |title = On the theory of order statistics |journal = [[Acta Mathematica Hungarica]] |volume = 4 |issue = 3 |pages = 191–231 |date = 1953 |doi = 10.1007/BF02127580 | doi-access=free }}</ref>
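Rényi's representation is easy to verify numerically. The following sketch (illustrative only; the rate ''λ'', sample size ''n'', index ''i'' and the use of NumPy are assumptions, not part of this article) compares exponential order statistics obtained by direct sorting with the scaled-spacings construction:

<syntaxhighlight lang="python">
# Illustrative sketch: check Rényi's representation of exponential order statistics
# by comparing the i-th smallest of sorted samples with the scaled-spacings sum.
import numpy as np

rng = np.random.default_rng(2)
lam, n, i, N = 2.0, 6, 4, 200_000              # rate, sample size, order index, replications

# Direct simulation: sort exponential samples and keep the i-th smallest value
direct = np.sort(rng.exponential(1 / lam, size=(N, n)), axis=1)[:, i - 1]

# Rényi representation: sum of independent standard exponentials scaled by 1/(n-j+1)
Z = rng.exponential(1.0, size=(N, i))
weights = 1.0 / (n - np.arange(1, i + 1) + 1)
renyi = (Z * weights).sum(axis=1) / lam

print(direct.mean(), renyi.mean())             # the moments should agree closely
print(direct.var(),  renyi.var())
</syntaxhighlight>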
==== Order statistics sampled from an Erlang distribution ====

The [[Laplace transform]]s of order statistics sampled from an [[Erlang distribution]] may be obtained via a path-counting method.<ref>{{Cite journal | last1 = Hlynka | first1 = M. | last2 = Brill | first2 = P. H. | last3 = Horn | first3 = W. | title = A method for obtaining Laplace transforms of order statistics of Erlang random variables | doi = 10.1016/j.spl.2009.09.006 | journal = Statistics & Probability Letters | volume = 80 | pages = 9–18 | year = 2010 }}</ref>

==== The joint distribution of the order statistics of an absolutely continuous distribution ====

If ''F''<sub>''X''</sub> is [[absolute continuity|absolutely continuous]], it has a density such that <math>dF_X(x)=f_X(x)\,dx</math>, and we can use the substitutions
:<math>u=F_X(x)</math>
and
:<math>du=f_X(x)\,dx</math>
to derive the following probability density functions for the order statistics of a sample of size ''n'' drawn from the distribution of ''X'':
:<math>f_{X_{(k)}}(x) =\frac{n!}{(k-1)!(n-k)!}[F_X(x)]^{k-1}[1-F_X(x)]^{n-k} f_X(x)</math>
:<math>f_{X_{(j)},X_{(k)}}(x,y) = \frac{n!}{(j-1)!(k-j-1)!(n-k)!}[F_X(x)]^{j-1}[F_X(y)-F_X(x)]^{k-j-1}[1-F_X(y)]^{n-k}f_X(x)f_X(y)</math> where <math>x\le y</math>
:<math>f_{X_{(1)},\ldots,X_{(n)}}(x_1,\ldots,x_n)=n!f_X(x_1)\cdots f_X(x_n)</math> where <math>x_1\le x_2\le \dots \le x_n.</math>
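As an illustration of the single-order-statistic density above, the following sketch (illustrative only; the standard normal parent distribution and the values of ''n'', ''k'' are assumptions, not part of this article) evaluates the formula and compares it with a Monte Carlo histogram:

<syntaxhighlight lang="python">
# Illustrative sketch: evaluate the density of the k-th order statistic of a standard
# normal sample via the formula above and compare it with a Monte Carlo histogram.
import numpy as np
from scipy.stats import norm
from scipy.special import comb

rng = np.random.default_rng(3)
n, k = 7, 5                                    # arbitrary illustrative choices

def order_stat_pdf(x):
    # f_{X_(k)}(x) = n!/((k-1)!(n-k)!) * F(x)^(k-1) * (1-F(x))^(n-k) * f(x)
    c = comb(n, k, exact=True) * k             # equals n! / ((k-1)! (n-k)!)
    return c * norm.cdf(x) ** (k - 1) * (1 - norm.cdf(x)) ** (n - k) * norm.pdf(x)

samples = np.sort(rng.standard_normal((200_000, n)), axis=1)[:, k - 1]
hist, edges = np.histogram(samples, bins=60, density=True)
mids = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - order_stat_pdf(mids))))   # should be small (sampling noise only)
</syntaxhighlight>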