=== Key concepts ===
Key concepts in classical test theory are [[Reliability (psychometric)|reliability]] and [[Test validity|validity]]. A reliable measure is one that measures a construct consistently across time, individuals, and situations. A valid measure is one that measures what it is intended to measure. Reliability is necessary, but not sufficient, for validity.

Both reliability and validity can be assessed statistically. Consistency across repeated administrations of the same test can be assessed with the [[Pearson product-moment correlation coefficient|Pearson correlation coefficient]] and is often called ''test-retest reliability.''<ref name="gifted.uconn">{{cite web|url=http://www.gifted.uconn.edu/Siegle/research/Instrument+Reliability+and+Validity/Reliability.htm|title=Home – Educational Research Basics by Del Siegle|website=www.gifted.uconn.edu|date=17 February 2015}}</ref> Similarly, the equivalence of different versions of the same measure can be indexed by a Pearson correlation, and is called ''equivalent forms reliability'' or a similar term.<ref name="gifted.uconn"/> Internal consistency, which addresses the homogeneity of a single test form, may be assessed by correlating performance on two halves of a test, which is termed ''split-half reliability''; the Pearson correlation between the two half-tests is then adjusted with the [[Spearman–Brown prediction formula]] so that it corresponds to the correlation between two full-length tests.<ref name="gifted.uconn"/> Perhaps the most commonly used index of reliability is [[Cronbach's α]], which is equivalent to the [[mean]] of all possible split-half coefficients. Other approaches include the [[intra-class correlation]], which expresses the variance between targets as a proportion of the total variance in the measurements.

There are a number of different forms of validity.
[[Criterion validity|Criterion-related validity]] refers to the extent to which a test or scale predicts a sample of behavior, i.e., the criterion, that is "external to the measuring instrument itself."<ref>Nunnally, J.C. (1978). ''Psychometric theory'' (2nd ed.). New York: McGraw-Hill.</ref> That external sample of behavior can take many forms: another test; college grade point average, as when the SAT taken in high school is used to predict performance in college; or even behavior that occurred in the past, as when a test of current psychological symptoms is used to infer past victimization (strictly speaking, postdiction rather than prediction). When the criterion measure is collected at the same time as the measure being validated, the goal is to establish ''[[concurrent validity]]''; when the criterion is collected later, the goal is to establish ''[[predictive validity]]''. A measure has ''[[construct validity]]'' if it is related to measures of other constructs as required by theory. ''[[Content validity]]'' is a demonstration that the items of a test do an adequate job of covering the domain being measured. In personnel selection, for example, test content is based on a defined statement or set of statements of the knowledge, skills, abilities, or other characteristics identified in a ''[[job analysis]]''.

[[Item response theory]] (IRT) models the relationship between [[latent trait]]s and responses to test items. Among other advantages, IRT provides a basis for estimating the location of a test-taker on a given latent trait, together with the standard error of measurement of that location. For example, a university student's knowledge of history can be estimated from their score on a university test and then compared reliably with a high school student's knowledge estimated from a less difficult test.
Scores derived by classical test theory do not have this characteristic, and actual ability (rather than ability relative to other test-takers) must be assessed by comparing scores to those of a "norm group" randomly selected from the population. In fact, all measures derived from classical test theory depend on the sample tested, whereas, in principle, those derived from item response theory do not.
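
The reliability indices described earlier in this section (test-retest correlation, split-half reliability with the Spearman–Brown adjustment, and Cronbach's α) can be sketched in a few lines. The example below is illustrative only: the score matrix and retest totals are hypothetical, the split into halves uses odd- and even-numbered items (one of many possible splits), and the function names are not drawn from any particular library.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def spearman_brown(r_half):
    """Adjust a half-test correlation to full test length."""
    return 2 * r_half / (1 + r_half)


def split_half(items):
    """Odd-even split-half reliability with Spearman-Brown adjustment.

    items: one row per test-taker, one column per item.
    """
    odd = [sum(row[0::2]) for row in items]
    even = [sum(row[1::2]) for row in items]
    return spearman_brown(pearson(odd, even))


def cronbach_alpha(items):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(items[0])
    item_vars = [variance(col) for col in zip(*items)]
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 test-takers, 4 items scored 1-5.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
# Hypothetical total scores from a second administration.
retest_totals = [17, 10, 12, 19, 7]

totals = [sum(row) for row in scores]
print("test-retest r:", round(pearson(totals, retest_totals), 3))
print("split-half (adjusted):", round(split_half(scores), 3))
print("Cronbach's alpha:", round(cronbach_alpha(scores), 3))
```

In practice a sample of five test-takers is far too small for stable estimates; the data are kept small so that the calculations can be followed by hand.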