==Other median-related concepts==

===Interpolated median===
When dealing with a discrete variable, it is sometimes useful to regard the observed values as midpoints of underlying continuous intervals. An example is a [[Likert scale]], on which opinions or preferences are expressed on a scale with a set number of possible responses. If the scale consists of the positive integers, an observation of 3 might be regarded as representing the interval from 2.5 to 3.5, and the median of the underlying variable can then be estimated.

Suppose, say, that 22% of the observations have value 2 or below and 55% have value 3 or below, so that 33% have the value 3. The median <math> m </math> is then 3, since the median is the smallest value of <math> x </math> for which <math> F(x) </math> is greater than a half, but the interpolated median lies somewhere between 2.5 and 3.5. First, half of the interval width <math> w </math> is added to the median to get the upper bound of the median interval. Then the interval width is split pro rata to the numbers of observations: the part subtracted from the upper bound is the proportion of the 33% that lies above the 50% mark. In this case the 33% is split into 28% below the 50% mark and 5% above it, so 5/33 of the interval width is subtracted from the upper bound of 3.5 to give an interpolated median of about 3.35. More formally, if <math> F(x) </math> is the proportion of observations at or below <math> x </math> and <math> f(x) </math> is the proportion equal to <math> x </math>, the interpolated median can be calculated from

<math display="block"> m_\text{int} = m + w\left[\frac{1}{2} - \frac{F( m ) - \frac{1}{2} }{f( m )}\right]. </math>

Alternatively, if in an observed sample there are <math> k </math> scores above the median category, <math> j </math> scores in it and <math> i </math> scores below it then the interpolated median is given by

<math display="block"> m_\text{int} = m + \frac{w}{2} \left[\frac{k - i} j\right]. </math>
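
The count-based formula can be checked numerically. The following is a minimal Python sketch, not a reference implementation: the function name <code>interpolated_median</code> and the choice to encode the 45% of observations above the median category as the value 4 are assumptions made only for illustration.

```python
from collections import Counter


def interpolated_median(values, width=1.0):
    """Interpolated median of discrete scores treated as interval midpoints."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one observation")
    counts = Counter(values)

    # Median category m: smallest value whose cumulative share exceeds one half.
    cumulative = 0
    for m in sorted(counts):
        cumulative += counts[m]
        if 2 * cumulative > n:
            break

    j = counts[m]        # scores in the median category
    i = cumulative - j   # scores below it
    k = n - cumulative   # scores above it
    return m + (width / 2) * (k - i) / j


# Worked example: 22% at 2 or below, 33% at 3, 45% above
# (the value 4 for the upper group is an assumption).
scores = [2] * 22 + [3] * 33 + [4] * 45
print(round(interpolated_median(scores), 2))  # 3.35
```

With <math> i = 22 </math>, <math> j = 33 </math> and <math> k = 45 </math> this gives <math> 3 + \tfrac{1}{2}\cdot\tfrac{23}{33} \approx 3.35 </math>, which agrees with the value obtained from the first formula.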

===Pseudo-median===
{{Main|Pseudomedian}}
For univariate distributions that are ''symmetric'' about one median, the [[Hodges–Lehmann estimator]] is a robust and highly efficient estimator of the population median; for non-symmetric distributions, the Hodges–Lehmann estimator is a robust and highly efficient estimator of the population ''pseudo-median'', which is the median of a symmetrized distribution and which is close to the population median.<ref>{{Cite journal|last1=Pratt|first1=William K.|last2=Cooper|first2=Ted J.|last3=Kabir|first3=Ihtisham|s2cid=173183609|editor1-first=Francis J|editor1-last=Corbett|date=1985-07-11|title=Pseudomedian Filter|journal=Architectures and Algorithms for Digital Image Processing II|volume=0534|pages=34|doi=10.1117/12.946562|bibcode=1985SPIE..534...34P}}</ref> The Hodges–Lehmann estimator has been generalized to multivariate distributions.<ref name="Oja 2010 xiv+232">{{cite book|title=Multivariate nonparametric methods with&nbsp;''R'': An approach based on spatial signs and ranks|last=Oja|first=Hannu|publisher=Springer|year=2010|isbn=978-1-4419-0467-6|series=Lecture Notes in Statistics|volume=199|location=New York, NY|pages=xiv+232|doi=10.1007/978-1-4419-0468-3|mr=2598854}}</ref>
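
As an illustration, the one-sample Hodges–Lehmann estimate is commonly computed as the median of all pairwise averages (Walsh averages) of the sample. The sketch below assumes that definition, including each observation paired with itself; the function name <code>hodges_lehmann</code> and the sample data are hypothetical.

```python
from itertools import combinations_with_replacement
from statistics import median


def hodges_lehmann(sample):
    """Median of the Walsh averages (x_i + x_j) / 2 for i <= j."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(sample, 2)]
    return median(walsh)


# One gross outlier barely moves the estimate (the sample mean is 22).
print(hodges_lehmann([1, 2, 3, 4, 100]))  # 3.0
```

This brute-force version forms all <math> n(n+1)/2 </math> averages, so it is only suitable for small samples.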

===Variants of regression===
The [[Theil–Sen estimator]] is a method for [[robust statistics|robust]] [[linear regression]] based on finding medians of [[slope]]s.<ref>{{citation
 | last = Wilcox | first = Rand R.
 | contribution = Theil–Sen estimator
 | isbn = 978-0-387-95157-7
 | pages = 207–210
 | publisher = Springer-Verlag
 | title = Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy
 | url = https://books.google.com/books?id=YSFb4QX2UIoC&pg=PA207
 | year = 2001}}.</ref>
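
A minimal Python sketch of the idea follows. It takes the median of the slopes through all pairs of points and, as an assumed convention for the intercept, the median of the residuals <math> y - bx </math>; the function name <code>theil_sen</code> and the data are illustrative only.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Return (slope, intercept) from the median of pairwise slopes."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # pairs with equal x have no defined slope
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # Assumed convention: intercept is the median of y - slope * x.
    intercept = median([y - slope * x for x, y in zip(xs, ys)])
    return slope, intercept


# Points on y = 2x + 1, with the last one replaced by an outlier.
print(theil_sen([0, 1, 2, 3, 4], [1, 3, 5, 7, 100]))  # (2.0, 1.0)
```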

===Median filter===
The [[median filter]] is an important tool in [[image processing]] that can effectively remove [[salt and pepper noise]] from [[grayscale]] images.
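
The following Python sketch shows one simple form of the filter on an image stored as a list of rows. Several details are assumptions rather than part of any standard: the function name <code>median_filter</code>, the square window of the given radius, the truncation of the window at the image border, and the use of the low median so that every output value is an actual pixel value.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace each pixel by the low median of its (2*radius+1)^2 window."""
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            # Assumed border handling: keep only neighbours inside the image.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            new_row.append(median_low(window))
        filtered.append(new_row)
    return filtered


# A flat grey image with one "salt" and one "pepper" pixel.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
noisy[0][4] = 0
cleaned = median_filter(noisy)
print(all(v == 10 for row in cleaned for v in row))  # True
```

Because an isolated extreme pixel is never the median of its neighbourhood, both noisy pixels are replaced by the surrounding grey level.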

===Cluster analysis===
{{main|k-medians clustering}}
In [[cluster analysis]], the [[k-medians clustering]] algorithm provides a way of defining clusters in which the criterion of maximising the distance between cluster means, used in [[k-means clustering]], is replaced by maximising the distance between cluster medians.
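
One common way to compute such a clustering, assumed here rather than taken from the description above, is a Lloyd-style iteration: assign each point to the nearest centre in Manhattan distance, then move each centre to the coordinate-wise median of its cluster. In the sketch below the function name <code>k_medians</code>, the caller-supplied starting centres and the iteration cap are all illustrative choices.

```python
from statistics import median


def k_medians(points, centres, iterations=10):
    """Lloyd-style k-medians with Manhattan distance (illustrative sketch)."""
    centres = [tuple(c) for c in centres]
    clusters = []
    for _ in range(iterations):
        # Assignment step: nearest centre in L1 (Manhattan) distance.
        clusters = [[] for _ in centres]
        for p in points:
            dists = [sum(abs(a - b) for a, b in zip(p, c)) for c in centres]
            clusters[dists.index(min(dists))].append(p)
        # Update step: coordinate-wise median; an empty cluster keeps its centre.
        new_centres = [
            tuple(median(coord) for coord in zip(*cluster)) if cluster else centre
            for cluster, centre in zip(clusters, centres)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centres, clusters = k_medians(points, centres=[(0, 0), (10, 10)])
print(centres)  # [(1, 1), (8, 8)]
```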

===Median–median line===

The median–median line is a method of robust regression. The idea dates back to [[Abraham Wald|Wald]] in 1940, who suggested dividing a set of bivariate data into two halves depending on the value of the independent variable <math>x</math>: a left half with values less than the median and a right half with values greater than the median.<ref name=Wald1940>{{cite journal |last=Wald |first=A. |year=1940 |title=The Fitting of Straight Lines if Both Variables are Subject to Error |journal=[[Annals of Mathematical Statistics]] |volume=11 |issue=3 |pages=282–300 |jstor=2235677 |doi=10.1214/aoms/1177731868 |url=http://dml.cz/bitstream/handle/10338.dmlcz/103573/AplMat_20-1975-2_3.pdf |doi-access=free }}</ref> He suggested taking the means of the dependent <math>y</math> and independent <math>x</math> variables of the left and the right halves and estimating the slope of the line joining these two points. The line could then be adjusted to fit the majority of the points in the data set.

Nair and Shrivastava in 1942 suggested a similar idea but instead advocated dividing the sample into three equal parts before calculating the means of the subsamples.<ref name=Nair1942>{{cite journal |title=On a Simple Method of Curve Fitting |first1=K. R. |last1=Nair |first2=M. P. |last2=Shrivastava |journal=Sankhyā: The Indian Journal of Statistics |volume=6 |issue=2 |year=1942 |pages=121–132 |jstor=25047749 }}</ref> Brown and Mood in 1951 proposed the idea of using the medians of two subsamples rather than the means.<ref name=Brown1951>{{cite book |last1=Brown |first1=G. W. |last2=Mood |first2=A. M. |year=1951 |chapter=On Median Tests for Linear Hypotheses |title=Proc Second Berkeley Symposium on Mathematical Statistics and Probability |location=Berkeley, CA |publisher=University of California Press |pages=159–166 |zbl=0045.08606 }}</ref> Tukey combined these ideas and recommended dividing the sample into three equal-sized subsamples and estimating the line based on the medians of the subsamples.<ref name=Tukey1971>{{cite book |last=Tukey |first=J. W. |year=1977 |title=Exploratory Data Analysis |location=Reading, MA |publisher=Addison-Wesley |isbn=0201076160 |url=https://archive.org/details/exploratorydataa00tuke_0 }}</ref>
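
The three-group approach can be sketched in Python as follows. The details are assumptions based on a common textbook formulation rather than on the sources cited above: the outer groups each receive <code>(n + 1) // 3</code> points, the slope joins the median points of the outer groups, and the intercept is the average of the intercepts implied by the three median points. The function name <code>median_median_line</code> is hypothetical.

```python
from statistics import median


def median_median_line(xs, ys):
    """Return (slope, intercept) of a three-group median-median line."""
    pts = sorted(zip(xs, ys))
    n = len(pts)
    if n < 3:
        raise ValueError("need at least three points")
    outer = (n + 1) // 3  # size of the left and right groups (assumed split)
    groups = [pts[:outer], pts[outer:n - outer], pts[n - outer:]]
    # Summary point of each group: (median of x, median of y).
    (xl, yl), (xm, ym), (xr, yr) = [
        (median([x for x, _ in g]), median([y for _, y in g])) for g in groups
    ]
    if xr == xl:
        raise ValueError("outer groups have the same median x")
    slope = (yr - yl) / (xr - xl)
    # Assumed convention: average the three intercepts through the summary points.
    intercept = ((yl - slope * xl) + (ym - slope * xm) + (yr - slope * xr)) / 3
    return slope, intercept


# Points on y = 2x + 1, with the last one replaced by an outlier.
xs = list(range(9))
ys = [1, 3, 5, 7, 9, 11, 13, 15, 100]
print(median_median_line(xs, ys))  # (2.0, 1.0)
```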