===Sum of variables===

====Sum of uncorrelated variables====
{{main article|Bienaymé's identity}}
{{see also|Sum of normally distributed random variables}}

One reason for the use of the variance in preference to other measures of dispersion is that the variance of the sum (or the difference) of [[uncorrelated]] random variables is the sum of their variances:
<math display="block">\operatorname{Var}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \operatorname{Var}(X_i).</math>
This statement is called the [[Irénée-Jules Bienaymé|Bienaymé]] formula<ref>[[Michel Loève|Loève, M.]] (1977) "Probability Theory", ''Graduate Texts in Mathematics'', Volume 45, 4th edition, Springer-Verlag, p. 12.</ref> and was discovered in 1853.<ref>[[Irénée-Jules Bienaymé|Bienaymé, I.-J.]] (1853) "Considérations à l'appui de la découverte de Laplace sur la loi de probabilité dans la méthode des moindres carrés", ''Comptes rendus de l'Académie des sciences Paris'', 37, p. 309–317; digital copy available [http://visualiseur.bnf.fr/CadresFenetre?O=NUMM-2994&I=313] {{Webarchive|url=https://web.archive.org/web/20180623145935/http://visualiseur.bnf.fr/CadresFenetre?O=NUMM-2994&I=313|date=2018-06-23}}</ref><ref>[[Irénée-Jules Bienaymé|Bienaymé, I.-J.]] (1867) "Considérations à l'appui de la découverte de Laplace sur la loi de probabilité dans la méthode des moindres carrés", ''Journal de Mathématiques Pures et Appliquées, Série 2'', Tome 12, p. 158–167; digital copy available [http://gallica.bnf.fr/ark:/12148/bpt6k16411c/f166.image.n19][http://sites.mathdoc.fr/JMPA/PDF/JMPA_1867_2_12_A10_0.pdf]</ref> It is often made with the stronger condition that the variables are [[statistical independence|independent]], but being uncorrelated suffices.
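
The claim that uncorrelatedness suffices can be checked exactly on a small example. The following is a minimal sketch, not part of the cited sources: it assumes a hypothetical pair in which ''X'' is uniform on {−1, 0, 1} and ''Y'' = ''X''<sup>2</sup>, so that ''Y'' is fully determined by ''X'' (hence dependent) while the covariance is still zero. The helper names <code>expectation</code> and <code>variance</code> are illustrative.

```python
from fractions import Fraction


def expectation(pmf, f):
    """E[f(w)] for a finite pmf given as {outcome: probability}."""
    return sum(p * f(w) for w, p in pmf.items())


def variance(pmf, f):
    """Var[f(w)] = E[f(w)^2] - (E[f(w)])^2."""
    mean = expectation(pmf, f)
    return expectation(pmf, lambda w: f(w) ** 2) - mean ** 2


# Hypothetical example: X uniform on {-1, 0, 1} and Y = X^2.
pmf = {w: Fraction(1, 3) for w in (-1, 0, 1)}


def X(w):
    return w


def Y(w):
    return w ** 2


# Cov(X, Y) = E[XY] - E[X] E[Y]; zero here although Y depends on X.
cov_xy = expectation(pmf, lambda w: X(w) * Y(w)) - expectation(pmf, X) * expectation(pmf, Y)

var_x = variance(pmf, X)
var_y = variance(pmf, Y)
var_sum = variance(pmf, lambda w: X(w) + Y(w))

print(cov_xy)                  # 0
print(var_sum, var_x + var_y)  # 8/9 8/9
```

Exact rational arithmetic is used so that the two sides agree exactly rather than up to rounding.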
If all the variables have the same variance σ<sup>2</sup>, then, since dividing by ''n'' scales the variance by 1/''n''<sup>2</sup>, the Bienaymé formula immediately implies that the variance of their mean is
<math display="block"> \operatorname{Var}\left(\overline{X}\right) = \operatorname{Var}\left(\frac{1}{n} \sum_{i=1}^n X_i\right) = \frac{1}{n^2}\sum_{i=1}^n \operatorname{Var}\left(X_i\right) = \frac{1}{n^2}n\sigma^2 = \frac{\sigma^2}{n}. </math>
That is, the variance of the mean decreases in proportion to 1/''n'' as ''n'' increases. This formula for the variance of the mean is used in the definition of the [[standard error (statistics)|standard error]] of the sample mean, which is used in the [[central limit theorem]].

To prove the initial statement, it suffices to show that
<math display="block">\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y).</math>
The general result then follows by induction. Starting with the definition,
<math display="block">\begin{align}
\operatorname{Var}(X + Y) &= \operatorname{E}\left[(X + Y)^2\right] - (\operatorname{E}[X + Y])^2 \\[5pt]
&= \operatorname{E}\left[X^2 + 2XY + Y^2\right] - (\operatorname{E}[X] + \operatorname{E}[Y])^2.
\end{align}</math>
Using the linearity of the [[Expectation Operator|expectation operator]] and the assumption of independence (or uncorrelatedness) of ''X'' and ''Y'', which gives <math>\operatorname{E}[XY] = \operatorname{E}[X]\operatorname{E}[Y]</math>, this further simplifies as follows:
<math display="block">\begin{align}
\operatorname{Var}(X + Y) &= \operatorname{E}{\left[X^2\right]} + 2\operatorname{E}[XY] + \operatorname{E}{\left[Y^2\right]} - \left(\operatorname{E}[X]^2 + 2\operatorname{E}[X] \operatorname{E}[Y] + \operatorname{E}[Y]^2\right) \\[5pt]
&= \operatorname{E}\left[X^2\right] + \operatorname{E}\left[Y^2\right] - \operatorname{E}[X]^2 - \operatorname{E}[Y]^2 \\[5pt]
&= \operatorname{Var}(X) + \operatorname{Var}(Y).
\end{align}</math>

====Sum of correlated variables====

=====Sum of correlated variables with fixed sample size=====
{{main article|Bienaymé's identity}}

In general, the variance of the sum of {{math|n}} variables is the sum of their [[covariance]]s:
<math display="block">\operatorname{Var}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \sum_{j=1}^n \operatorname{Cov}\left(X_i, X_j\right) = \sum_{i=1}^n \operatorname{Var}\left(X_i\right) + 2 \sum_{1 \leq i < j\leq n} \operatorname{Cov}\left(X_i, X_j\right).</math>
(Note: The second equality uses {{math|1=Cov(''X''<sub>''i''</sub>,''X''<sub>''i''</sub>) = Var(''X''<sub>''i''</sub>)}} and the symmetry of the covariance.)

Here, <math>\operatorname{Cov}(\cdot,\cdot)</math> is the [[covariance]], which is zero for independent random variables (if it exists). The first expression states that the variance of a sum is equal to the sum of all elements in the covariance matrix of the components. The second expression states equivalently that the variance of the sum is the sum of the diagonal of the covariance matrix plus twice the sum of its strictly upper triangular elements (or its strictly lower triangular elements); this relies on the covariance matrix being symmetric. This formula is used in the theory of [[Cronbach's alpha]] in [[classical test theory]].

So, if the variables have equal variance ''σ''<sup>2</sup> and the average [[correlation]] of distinct variables is ''ρ'', then the variance of their mean is
<math display="block">\operatorname{Var}\left(\overline{X}\right) = \frac{\sigma^2}{n} + \frac{n - 1}{n}\rho\sigma^2.</math>
This implies that the variance of the mean increases with the average of the correlations. In other words, additional positively correlated observations are not as effective as additional independent observations at reducing the [[standard error|uncertainty of the mean]].
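
The relation between the covariance-matrix sum and the closed form for the variance of the mean can be illustrated numerically. The sketch below assumes a hypothetical equicorrelated covariance matrix (variance ''σ''<sup>2</sup> on the diagonal, ''ρσ''<sup>2</sup> elsewhere); the function names and the values of <code>n</code>, <code>sigma2</code> and <code>rho</code> are chosen only for illustration.

```python
from fractions import Fraction


def var_of_sum(cov):
    """Variance of X_1 + ... + X_n: the sum of all covariance-matrix entries."""
    return sum(sum(row) for row in cov)


def equicorrelated_cov(n, sigma2, rho):
    """Covariance matrix with sigma2 on the diagonal and rho * sigma2 elsewhere."""
    return [[sigma2 if i == j else rho * sigma2 for j in range(n)]
            for i in range(n)]


# Illustrative values (assumptions, not taken from the article).
n, sigma2, rho = 5, Fraction(4), Fraction(1, 2)
cov = equicorrelated_cov(n, sigma2, rho)

# Var(mean) = Var(sum) / n^2, compared with sigma^2/n + (n-1)/n * rho * sigma^2.
var_mean = var_of_sum(cov) / n ** 2
closed_form = sigma2 / n + Fraction(n - 1, n) * rho * sigma2

print(var_mean, closed_form)  # 12/5 12/5
```

Setting <code>rho</code> to zero in this sketch reduces the result to ''σ''<sup>2</sup>/''n'', the uncorrelated case.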
Moreover, if the variables have unit variance, for example if they are standardized, then this simplifies to
<math display="block">\operatorname{Var}\left(\overline{X}\right) = \frac{1}{n} + \frac{n - 1}{n}\rho.</math>
This formula is used in the [[Spearman–Brown prediction formula]] of classical test theory. This converges to ''ρ'' if ''n'' goes to infinity, provided that the average correlation remains constant or converges too. So for the variance of the mean of standardized variables with equal correlations or converging average correlation we have
<math display="block">\lim_{n \to \infty} \operatorname{Var}\left(\overline{X}\right) = \rho.</math>
Therefore, the variance of the mean of a large number of standardized variables is approximately equal to their average correlation. This makes clear that the sample mean of correlated variables does not generally converge to the population mean, even though the [[law of large numbers]] states that the sample mean will converge for independent variables.

=====Sum of uncorrelated variables with random sample size=====
There are cases when a sample is taken without knowing, in advance, how many observations will be acceptable according to some criterion. In such cases, the sample size {{math|N}} is a random variable whose variation adds to the variation of {{math|X}}, such that,<ref>Cornell, J R, and Benjamin, C A, ''Probability, Statistics, and Decisions for Civil Engineers,'' McGraw-Hill, NY, 1970, pp.178-9.</ref>
<math display="block">\operatorname{Var}\left(\sum_{i=1}^{N}X_i\right)=\operatorname{E}\left[N\right]\operatorname{Var}(X)+\operatorname{Var}(N)(\operatorname{E}\left[X\right])^2</math>
which follows from the [[law of total variance]]. If {{math|N}} has a [[Poisson distribution]], then <math>\operatorname{E}[N]=\operatorname{Var}(N)</math> with estimator {{math|n}} = {{math|N}}.
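
The random-sample-size identity above can be verified exactly by enumeration on a small case. This is a sketch under stated assumptions: the distributions <code>n_pmf</code> and <code>x_pmf</code> are hypothetical, {{math|N}} is taken to be independent of the summands, and the summands are independent and identically distributed.

```python
from fractions import Fraction
from itertools import product


def moments(pmf):
    """Return (mean, variance) of a finite pmf given as {value: probability}."""
    mean = sum(p * v for v, p in pmf.items())
    second = sum(p * v * v for v, p in pmf.items())
    return mean, second - mean ** 2


def random_sum_pmf(n_pmf, x_pmf):
    """Exact pmf of S = X_1 + ... + X_N, with N independent of the i.i.d. X_i."""
    s_pmf = {}
    for n, p_n in n_pmf.items():
        # Enumerate every sequence of n draws; n = 0 yields the empty sum 0.
        for draws in product(x_pmf.items(), repeat=n):
            total = sum(value for value, _ in draws)
            prob = p_n
            for _, p in draws:
                prob *= p
            s_pmf[total] = s_pmf.get(total, 0) + prob
    return s_pmf


# Hypothetical distributions chosen only for illustration.
n_pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}
x_pmf = {1: Fraction(1, 2), 3: Fraction(1, 2)}

n_mean, n_var = moments(n_pmf)
x_mean, x_var = moments(x_pmf)
s_mean, direct_var = moments(random_sum_pmf(n_pmf, x_pmf))

# Right-hand side of the identity: E[N] Var(X) + Var(N) (E[X])^2.
formula_var = n_mean * x_var + n_var * x_mean ** 2

print(direct_var, formula_var)  # 4 4
```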
In the Poisson case, substituting {{math|n}} for both <math>\operatorname{E}[N]</math> and <math>\operatorname{Var}(N)</math>, the estimator of <math>\operatorname{Var}\left(\sum_{i=1}^{n}X_i\right)</math> becomes <math>n{S_x}^2+n\bar{X}^2</math>, giving <math>\operatorname{SE}(\bar{X})=\sqrt{\frac{{S_x}^2+\bar{X}^2}{n}}</math> (see [[Standard error#Standard_error_of_the_sample_mean|standard error of the sample mean]]).

====Weighted sum of variables====
{{see also|Weighted arithmetic mean#Variance{{!}}Variance of a weighted arithmetic mean}}
{{distinguish|Weighted variance}}

The scaling property and the Bienaymé formula, along with the property of the [[covariance]] {{math|Cov(''aX'', ''bY'') {{=}} ''ab'' Cov(''X'', ''Y'')}}, jointly imply that
<math display="block">\operatorname{Var}(aX \pm bY) =a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) \pm 2ab\, \operatorname{Cov}(X, Y).</math>
Because the weights enter squared, the variable with the largest weight in a weighted sum has a disproportionately large weight in the variance of the total. For example, if ''X'' and ''Y'' are uncorrelated and the weight of ''X'' is two times the weight of ''Y'', then the weight of the variance of ''X'' will be four times the weight of the variance of ''Y''.

The expression above can be extended to a weighted sum of multiple variables:
<math display="block">\operatorname{Var}\left(\sum_{i=1}^n a_iX_i\right) = \sum_{i=1}^n a_i^2 \operatorname{Var}(X_i) + 2\sum_{1\le i<j\le n}a_ia_j\operatorname{Cov}(X_i,X_j)</math>
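
As a small illustration of the weighted-sum formula, the sketch below evaluates it for a given weight vector and covariance matrix. The function name <code>weighted_sum_variance</code> and the numeric inputs are assumptions made for this example only.

```python
def weighted_sum_variance(weights, cov):
    """Var(sum_i a_i X_i) = sum_i a_i^2 Var(X_i) + 2 sum_{i<j} a_i a_j Cov(X_i, X_j)."""
    n = len(weights)
    diagonal = sum(weights[i] ** 2 * cov[i][i] for i in range(n))
    cross = sum(weights[i] * weights[j] * cov[i][j]
                for i in range(n) for j in range(i + 1, n))
    return diagonal + 2 * cross


# Uncorrelated X and Y with unit variance; X carries twice the weight of Y,
# so X contributes 4 to the total variance and Y contributes 1.
identity_cov = [[1, 0],
                [0, 1]]
print(weighted_sum_variance([2, 1], identity_cov))   # 5

# Correlated case with Var(X) = 4, Var(Y) = 9, Cov(X, Y) = 1.
cov = [[4, 1],
       [1, 9]]
print(weighted_sum_variance([1, 1], cov))    # Var(X + Y) = 4 + 9 + 2 = 15
print(weighted_sum_variance([1, -1], cov))   # Var(X - Y) = 4 + 9 - 2 = 11
```

The second pair of calls reproduces the ± sign in the two-variable formula: the covariance term is added for a sum and subtracted for a difference.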