Geometric distribution
== Properties ==

=== Memorylessness ===
{{Main article|Memorylessness}}
The geometric distribution is the only memoryless discrete probability distribution.<ref>{{Cite book |last1=Dekking |first1=Frederik Michel |url=http://link.springer.com/10.1007/1-84628-168-7 |title=A Modern Introduction to Probability and Statistics |last2=Kraaikamp |first2=Cornelis |last3=Lopuhaä |first3=Hendrik Paul |last4=Meester |first4=Ludolf Erwin |date=2005 |publisher=Springer London |isbn=978-1-85233-896-1 |series=Springer Texts in Statistics |location=London |page=50 |language=en |doi=10.1007/1-84628-168-7}}</ref> Its memorylessness is the discrete counterpart of the same property of the [[exponential distribution]].<ref name=":8">{{Cite book |last1=Johnson |first1=Norman L. |url=https://onlinelibrary.wiley.com/doi/book/10.1002/0471715816 |title=Univariate Discrete Distributions |last2=Kemp |first2=Adrienne W.|author2-link=Adrienne W. Kemp |last3=Kotz |first3=Samuel |date=2005-08-19 |publisher=Wiley |isbn=978-0-471-27246-5 |edition=1 |series=Wiley Series in Probability and Statistics |page= |language=en |doi=10.1002/0471715816}}</ref>{{Rp|page=228}} The property asserts that the number of previously failed trials does not affect the number of future trials needed for a success. Because there are two definitions of the geometric distribution, there are also two definitions of memorylessness for discrete random variables.<ref>{{Cite web |last=Weisstein |first=Eric W. 
|title=Memoryless |url=https://mathworld.wolfram.com/ |access-date=2024-07-25 |website=mathworld.wolfram.com |language=en}}</ref> Expressed in terms of [[conditional probability]], the two definitions are<math display="block">\Pr(X>m+n\mid X>n)=\Pr(X>m),</math>and<math display="block">\Pr(Y>m+n\mid Y\geq n)=\Pr(Y>m),</math>where <math>m</math> and <math>n</math> are [[Natural number|natural numbers]], <math>X</math> is a geometrically distributed random variable defined over <math>\mathbb{N}</math>, and <math>Y</math> is a geometrically distributed random variable defined over <math>\mathbb{N}_0</math>. These definitions are not equivalent for discrete random variables: <math>Y</math> does not satisfy the first equation and <math>X</math> does not satisfy the second.

=== Moments and cumulants ===
The [[expected value]] and [[variance]] of a geometrically distributed [[random variable]] <math>X</math> defined over <math>\mathbb{N}</math> are<ref name=":1" />{{Rp|page=261}}<math display="block">\operatorname{E}(X) = \frac{1}{p}, \qquad\operatorname{var}(X) = \frac{1-p}{p^2}.</math>For a geometrically distributed random variable <math>Y</math> defined over <math>\mathbb{N}_0</math>, the expected value becomes<math display="block">\operatorname{E}(Y) = \frac{1-p} p,</math>while the variance stays the same.<ref name=":0">{{Cite book |last1=Forbes |first1=Catherine |url=https://onlinelibrary.wiley.com/doi/book/10.1002/9780470627242 |title=Statistical Distributions |last2=Evans |first2=Merran |last3=Hastings |first3=Nicholas |last4=Peacock |first4=Brian |date=2010-11-29 |publisher=Wiley |isbn=978-0-470-39063-4 |edition=1st |pages= |language=en |doi=10.1002/9780470627242}}</ref>{{Rp|pages=114–115}} For example, when rolling a six-sided die until landing on a "1", the average number of rolls needed is <math>\frac{1}{1/6} = 6</math> and the average number of failures is <math>\frac{1 - 1/6}{1/6} = 5</math>.
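The two memorylessness identities and the moment formulas above can be checked numerically. The following is a minimal Python sketch and is not taken from the cited sources: the function names (<code>pmf_trials</code>, <code>tail_trials</code>, <code>mean_and_variance</code>) and the truncation point <code>k_max</code> are illustrative assumptions, and the infinite sums are truncated, so the results are approximations rather than exact values.

```python
import math


def pmf_trials(k, p):
    """P(X = k) for X on {1, 2, ...}: k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p


def tail_trials(n, p, k_max=2000):
    """Approximate P(X > n) by summing the PMF over k = n + 1, ..., k_max."""
    return sum(pmf_trials(k, p) for k in range(n + 1, k_max + 1))


def mean_and_variance(p, k_max=2000):
    """Approximate E(X) and var(X) by truncating the defining sums at k_max."""
    mean = sum(k * pmf_trials(k, p) for k in range(1, k_max + 1))
    second_moment = sum(k * k * pmf_trials(k, p) for k in range(1, k_max + 1))
    return mean, second_moment - mean ** 2


# Die example: success probability p = 1/6 (rolling a "1").
p = 1 / 6
m, n = 3, 4  # arbitrary natural numbers chosen for the illustration

mean_x, var_x = mean_and_variance(p)
mean_y = mean_x - 1  # Y = X - 1 counts failures only

# First definition: P(X > m + n | X > n) compared with P(X > m).
lhs_x = tail_trials(m + n, p) / tail_trials(n, p)
rhs_x = tail_trials(m, p)

# Second definition, using Y = X - 1: P(Y > j) = P(X > j + 1) and
# P(Y >= n) = P(X > n), so P(Y > m + n | Y >= n) is compared with P(Y > m).
lhs_y = tail_trials(m + n + 1, p) / tail_trials(n, p)
rhs_y = tail_trials(m + 1, p)

print(mean_x, var_x, mean_y)
print(lhs_x, rhs_x)
print(lhs_y, rhs_y)
```

With <math>p = 1/6</math> the computed mean and variance of <math>X</math> come out close to <math>1/p = 6</math> and <math>(1-p)/p^2 = 30</math>, and each conditional tail probability agrees with its unconditional counterpart up to floating-point error.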

The [[Moment-generating function|moment generating functions]] of the geometric distribution defined over <math> \mathbb{N} </math> and <math>\mathbb{N}_0</math> are, respectively,<ref>{{Cite book |last1=Bertsekas |first1=Dimitri P. |url=https://archive.org/details/introductiontopr0000bert_p5i9_2ndedi |title=Introduction to probability |last2=Tsitsiklis |first2=John N. |publisher=Athena Scientific |year=2008 |isbn=978-1-886529-23-6 |edition=2nd |series=Optimization and computation series |location=Belmont |page=235 |language=en}}</ref><ref name=":0" />{{Rp|page=114}}<math display="block">\begin{align} M_X(t) &= \frac{pe^t}{1-(1-p)e^t}, \\ M_Y(t) &= \frac{p}{1-(1-p)e^t}, \end{align}</math>both valid for <math>t < -\ln(1-p)</math>.

The moments of the number of failures before the first success are given by
: <math> \begin{align} \mathrm{E}(Y^n) & {} =\sum_{k=0}^\infty (1-p)^k p\cdot k^n \\ & {} =p \operatorname{Li}_{-n}(1-p) & (\text{for }n \neq 0) \end{align} </math>
where <math> \operatorname{Li}_{-n}(1-p) </math> is the [[Polylogarithm|polylogarithm function]].<ref>{{Cite web |last=Weisstein |first=Eric W. |title=Geometric Distribution |url=https://mathworld.wolfram.com/ |access-date=2024-07-13 |website=[[MathWorld]] |language=en}}</ref>

The [[cumulant generating function]] of the geometric distribution defined over <math>\mathbb{N}_0</math> is<ref name=":8" />{{Rp|page=216}}<math display="block">K(t) = \ln p - \ln (1 - (1-p)e^t).</math>When the distribution is defined over <math>\mathbb{N}_0</math>, the [[cumulant]]s <math>\kappa_r</math> satisfy the recursion<math display="block">\kappa_{r+1} = q \frac{d\kappa_r}{dq}, \qquad r=1,2,\dotsc</math>where <math>q = 1-p</math>.<ref name=":8" />{{Rp|page=216}}

==== Proof of expected value ====
Consider the expected value <math>\mathrm{E}(X)</math> of ''X'' as above, i.e. the average number of trials until a success. The first trial either succeeds with probability <math>p</math>, or fails with probability <math>1-p</math>.
If it fails, the '''remaining''' mean number of trials until a success is identical to the original mean, because all trials are independent. From this we get the formula:
: <math>\operatorname{E}(X) = p \cdot 1 + (1-p)(1 + \operatorname{E}(X)),</math>
which, when solved for <math> \mathrm{E}(X) </math>, gives:
: <math>\operatorname{E}(X) = \frac{1}{p}.</math>
The expected number of '''failures''' <math>Y</math> can be found from the [[linearity of expectation]], <math>\mathrm{E}(Y) = \mathrm{E}(X-1) = \mathrm{E}(X) - 1 = \frac 1 p - 1 = \frac{1-p}{p}</math>. It can also be shown in the following way:
: <math> \begin{align} \operatorname E(Y) & =p\sum_{k=0}^\infty(1-p)^k k \\ & = p (1-p) \sum_{k=0}^\infty (1-p)^{k-1} k\\ & = p (1-p) \left(-\sum_{k=0}^\infty \frac{d}{dp}\left[(1-p)^k\right]\right) \\ & = p (1-p) \left[\frac{d}{dp}\left(-\sum_{k=0}^\infty (1-p)^k\right)\right] \\ & = p(1-p)\frac{d}{dp}\left(-\frac{1}{p}\right) \\ & = \frac{1-p}{p}. \end{align} </math>
The interchange of summation and differentiation is justified by the fact that convergent [[power series]] [[uniform convergence|converge uniformly]] on [[compact space|compact]] subsets of the set of points where they converge.

=== Summary statistics ===
The [[mean]] of the geometric distribution is its expected value which is, as previously discussed in [[Geometric distribution#Moments and cumulants|§ Moments and cumulants]], <math>\frac{1}{p}</math> or <math>\frac{1-p}{p}</math> when defined over <math>\mathbb{N}</math> or <math>\mathbb{N}_0</math> respectively.

The [[median]] of the geometric distribution is <math>\left\lceil -\frac{\log 2}{\log(1-p)} \right\rceil</math> when defined over <math>\mathbb{N}</math><ref>{{Cite book |last=Aggarwal |first=Charu C. 
|url=https://link.springer.com/10.1007/978-3-031-53282-5 |title=Probability and Statistics for Machine Learning: A Textbook |publisher=Springer Nature Switzerland |year=2024 |isbn=978-3-031-53281-8 |location=Cham |page=138 |language=en |doi=10.1007/978-3-031-53282-5}}</ref> and <math>\left\lfloor-\frac{\log 2}{\log(1-p)}\right\rfloor</math> when defined over <math>\mathbb{N}_0</math>.<ref name=":2" />{{Rp|page=69}}

The [[Mode (statistics)|mode]] of the geometric distribution is the first value in the support set. This is 1 when defined over <math>\mathbb{N}</math> and 0 when defined over <math>\mathbb{N}_0</math>.<ref name=":2" />{{Rp|page=69}}

The [[skewness]] of the geometric distribution is <math>\frac{2-p}{\sqrt{1-p}}</math>.<ref name=":0" />{{Rp|page=115}}

The [[kurtosis]] of the geometric distribution is <math>9 + \frac{p^2}{1-p}</math>.<ref name=":0" />{{Rp|page=115}} The [[excess kurtosis]] of a distribution is the difference between its kurtosis and the kurtosis of a [[normal distribution]], <math>3</math>.<ref name=":4">{{Cite book |last=Chan |first=Stanley |url=https://probability4datascience.com/ |title=Introduction to Probability for Data Science |publisher=[[Michigan Publishing]] |year=2021 |isbn=978-1-60785-747-1 |edition=1st |language=en}}</ref>{{Rp|page=217}} Therefore, the excess kurtosis of the geometric distribution is <math>6 + \frac{p^2}{1-p}</math>. Since <math>\frac{p^2}{1-p} \geq 0</math>, the excess kurtosis is always positive, so the distribution is [[leptokurtic]].<ref name=":2" />{{Rp|page=69}} In other words, the tail of a geometric distribution is heavier than that of a Gaussian: it decays more slowly.<ref name=":4" />{{Rp|page=217}}
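
The closed-form median, skewness and excess kurtosis above can be compared against values computed directly from the probability mass function. The following is a minimal Python sketch for the variant defined over <math>\mathbb{N}</math>, not a reference implementation: the function names <code>closed_form_summary</code> and <code>numeric_summary</code> and the truncation point <code>k_max</code> are illustrative assumptions, the sums are truncated approximations, and the median comparison assumes <math>-\log 2 / \log(1-p)</math> is not an integer.

```python
import math


def closed_form_summary(p):
    """Closed-form summary statistics for X on {1, 2, ...}."""
    q = 1 - p
    return {
        "median": math.ceil(-math.log(2) / math.log(q)),
        "skewness": (2 - p) / math.sqrt(q),
        "excess_kurtosis": 6 + p * p / q,
    }


def numeric_summary(p, k_max=2000):
    """The same statistics, approximated from the PMF truncated at k_max."""
    q = 1 - p
    ks = range(1, k_max + 1)
    pmf = [q ** (k - 1) * p for k in ks]
    mean = sum(k * w for k, w in zip(ks, pmf))

    def central_moment(r):
        # r-th central moment: E[(X - mean)^r]
        return sum((k - mean) ** r * w for k, w in zip(ks, pmf))

    variance = central_moment(2)

    # Median taken here as the smallest k whose cumulative probability reaches 1/2.
    cumulative = 0.0
    median = None
    for k, w in zip(ks, pmf):
        cumulative += w
        if cumulative >= 0.5:
            median = k
            break

    return {
        "median": median,
        "skewness": central_moment(3) / variance ** 1.5,
        "excess_kurtosis": central_moment(4) / variance ** 2 - 3,
    }


p = 1 / 6  # die example: probability of rolling a "1"
closed = closed_form_summary(p)
numeric = numeric_summary(p)
print(closed)
print(numeric)
```

For <math>p = 1/6</math> both routes give a median of 4 rolls, and the skewness and excess kurtosis agree up to floating-point error. Skewness and kurtosis are unchanged by the shift <math>Y = X - 1</math>, so only the median differs between the two definitions.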