==Elements==
[[File:Box-Plot mit Min-Max Abstand.png|thumb|Figure 2. Box-plot with whiskers from minimum to maximum]]
[[File:Box-Plot mit Interquartilsabstand.png|thumb|Figure 3. Same box-plot with whiskers drawn within the 1.5 IQR value]]

A boxplot is a standardized way of displaying the dataset based on the [[five-number summary]]: the minimum, the maximum, the sample median, and the first and third quartiles.

* '''[[Sample minimum|Minimum]] (''Q''<sub>0</sub> or 0th [[percentile]])''': the lowest data point in the data set excluding any outliers
* '''[[Sample maximum|Maximum]] (''Q''<sub>4</sub> or 100th percentile)''': the highest data point in the data set excluding any outliers
* '''[[Median]] (''Q''<sub>2</sub> or 50th percentile)''': the middle value in the data set
* '''[[First quartile]] (''Q''<sub>1</sub> or 25th percentile)''': also known as the ''lower quartile'' ''q''<sub>''n''</sub>(0.25), it is the median of the lower half of the dataset.
* '''[[Third quartile]] (''Q''<sub>3</sub> or 75th percentile)''': also known as the ''upper quartile'' ''q''<sub>''n''</sub>(0.75), it is the median of the upper half of the dataset.<ref>{{cite journal |last1=Holmes |first1=Alexander |last2=Illowsky |first2=Barbara |last3=Dean |first3=Susan |title=Introductory Business Statistics |website=OpenStax |date=31 March 2015 |url=https://opentextbc.ca/introbusinessstatopenstax/chapter/measures-of-the-location-of-the-data/ |access-date=29 April 2020 |archive-date=27 July 2020 |archive-url=https://web.archive.org/web/20200727025431/https://opentextbc.ca/introbusinessstatopenstax/chapter/measures-of-the-location-of-the-data/ |url-status=dead }}</ref>

Besides the minimum and maximum values, another quantity that can be used to construct a box-plot is the interquartile range (IQR):

* '''[[Interquartile range]] (IQR)''': the distance between the upper and lower quartiles
:: <math>\text{IQR} = Q_3 - Q_1 = q_n(0.75) - q_n(0.25)</math>

A box-plot usually includes two parts, a box and a set of whiskers, as shown in Figure 2.

===Box===
The box is drawn from ''Q''<sub>1</sub> to ''Q''<sub>3</sub> with a horizontal line drawn inside it to denote the median. Some box plots include an additional character to represent the mean of the data.<ref name="frigge hoaglin iglewicz2">{{Cite journal|last1=Frigge|first1=Michael|last2=Hoaglin|first2=David C.|last3=Iglewicz|first3=Boris|date=February 1989|title=Some Implementations of the Boxplot|journal=[[The American Statistician]]|volume=43|issue=1|pages=50–54|doi=10.2307/2685173|jstor=2685173}}</ref><ref>{{cite journal|last1=Marmolejo-Ramos|first1=F.|last2=Tian|first2=S.|date=2010|title=The shifting boxplot. A boxplot based on essential summary statistics around the mean|journal=International Journal of Psychological Research|volume=3|issue=1|pages=37–46|doi=10.21500/20112084.823|doi-access=free|hdl=10819/6492|hdl-access=free}}</ref>

===Whiskers===
The whiskers must end at an observed data point, but can be defined in various ways. Because of this variability, it is appropriate to describe the convention that is being used for the whiskers and outliers in the caption of the box-plot.

In the most straightforward method, the boundary of the lower whisker is the minimum value of the data set, and the boundary of the upper whisker is the maximum value of the data set.

Another popular choice for the boundaries of the whiskers is based on the 1.5 IQR value. From above the upper quartile ('''''Q''<sub>3</sub>'''), a distance of 1.5 times the IQR is measured out and a whisker is drawn ''up to'' the largest observed data point from the dataset that falls within this distance. Similarly, a distance of 1.5 times the IQR is measured out below the lower quartile ('''''Q''<sub>1</sub>''') and a whisker is drawn ''down to'' the lowest observed data point from the dataset that falls within this distance.
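
The quartile, IQR, and 1.5 IQR whisker rules described above can be sketched in Python. This is only an illustration under stated assumptions: the helper name <code>box_plot_stats</code> and its parameter <code>k</code> are hypothetical, and for an odd number of observations the overall median is left out of both halves (other quartile conventions exist and can give slightly different values).

```python
from statistics import median


def box_plot_stats(data, k=1.5):
    """Box and whisker positions under the 1.5 IQR rule (hypothetical helper)."""
    xs = sorted(data)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    # Assumption: with odd n, the overall median belongs to neither half.
    lower, upper = xs[:half], xs[n - half:]
    q1, q2, q3 = median(lower), median(xs), median(upper)
    iqr = q3 - q1
    # Distances of k * IQR measured out below Q1 and above Q3.
    low_limit, high_limit = q1 - k * iqr, q3 + k * iqr
    inside = [x for x in xs if low_limit <= x <= high_limit]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        # Whiskers end at observed data points, not at the limits themselves.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "beyond_whiskers": [x for x in xs if x < low_limit or x > high_limit],
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = box_plot_stats(sample)
print(stats)
```

For this sample the quartiles are 3 and 8, so the IQR is 5 and the limits lie at -4.5 and 15.5. The whiskers end at the observed values 1 and 9, and the value 30 falls beyond the upper whisker.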
Because the whiskers must end at an observed data point, the whisker lengths can look unequal, even though 1.5 IQR is the same for both sides. All other observed data points outside the boundary of the whiskers are plotted as '''outliers'''.<ref>{{Cite book |title=A Modern Introduction to Probability and Statistics |url=https://archive.org/details/modernintroducti00dekk_722 |url-access=limited |last=Dekking |first=F.M. |publisher=Springer |year=2005 |isbn=1-85233-896-2 |pages=[https://archive.org/details/modernintroducti00dekk_722/page/n240 234]–238 }}</ref> The outliers can be plotted on the box-plot as a dot, a small circle, a star, ''etc.'' (see example below).

[[File:Box Plot Picture.png|thumb|389x389px|This is a picture of a box plot representing data]]

In other representations, the whisker ends can stand for other quantities, such as:
* One [[standard deviation]] above and below the mean of the data set
* The 9th percentile and the 91st percentile of the data set
* The 2nd percentile and the 98th percentile of the data set

Rarely, a box-plot is plotted without whiskers. This can be appropriate for sensitive information, so that the whiskers (and outliers) do not disclose actual observed values.<ref name="DGRW">{{Cite book|last1=Derrick|first1=Ben|last2=Green|first2=Elizabeth|last3=Ritchie|first3=Felix|last4=White|first4=Paul|date=September 2022|chapter=The Risk of Disclosure When Reporting Commonly Used Univariate Statistics|title=Privacy in Statistical Databases|series=Lecture Notes in Computer Science |volume=13463|pages=119–129|doi=10.1007/978-3-031-13945-1_9|isbn=978-3-031-13944-4 }}</ref>

The unusual percentiles 2%, 9%, 91%, 98% are sometimes used for whisker cross-hatches and whisker ends to depict the [[seven-number summary]]. If the data are [[Normal distribution|normally distributed]], the locations of the seven marks on the box plot will be approximately equally spaced. On some box plots, a cross-hatch is placed before the end of each whisker.
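
The spacing claim for the seven-number summary can be checked numerically. The sketch below is only an illustration, assuming a standard normal distribution; the helper name <code>seven_mark_positions</code> is hypothetical. It uses <code>NormalDist.inv_cdf</code> from Python's standard <code>statistics</code> module to locate the 2nd, 9th, 25th, 50th, 75th, 91st and 98th percentiles.

```python
from statistics import NormalDist

# Percentiles used for the seven marks, as listed in the text.
PERCENTILES = [2, 9, 25, 50, 75, 91, 98]


def seven_mark_positions(mu=0.0, sigma=1.0):
    """Positions of the seven marks for a normal distribution (hypothetical helper)."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf(p / 100) for p in PERCENTILES]


marks = seven_mark_positions()
# Distance between each pair of neighbouring marks.
gaps = [b - a for a, b in zip(marks, marks[1:])]

for p, m in zip(PERCENTILES, marks):
    print(f"{p:>2}th percentile: {m:+.3f}")
print("gaps:", [round(g, 3) for g in gaps])
```

For a standard normal distribution the gaps come out at roughly 0.67 to 0.71 standard deviations, so the marks are close to, but not exactly, evenly spaced.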