== History ==
{{Rquote|right|there is ... a machine called the Univac ... whereby letters and figures are coded as a pattern of magnetic spots on a long steel tape. By this means the text of a document, preceded by its subject code symbol, can be recorded ... the machine ... automatically selects and types out those references which have been coded in any desired way at a rate of 120 words a minute|J. E. Holmstrom, 1948}}

The idea of using computers to search for relevant pieces of information was popularized in the article ''[[As We May Think]]'' by [[Vannevar Bush]] in 1945.<ref name="Singhal2001">{{cite journal |last=Singhal |first=Amit |title=Modern Information Retrieval: A Brief Overview |journal=Bulletin of the IEEE Computer Society Technical Committee on Data Engineering |volume=24 |issue=4 |pages=35–43 |year=2001 |url=http://singhal.info/ieee2001.pdf }}</ref> It would appear that Bush was inspired by patents for a 'statistical machine' – filed by [[Emanuel Goldberg]] in the 1920s and 1930s – that searched for documents stored on film.<ref name="Sanderson2012">{{cite journal |author=Mark Sanderson & W. Bruce Croft |title=The History of Information Retrieval Research |journal=Proceedings of the IEEE |volume=100 |pages=1444–1451 |year=2012 |doi=10.1109/jproc.2012.2189916 |doi-access=free }}</ref> The first description of a computer searching for information was given by Holmstrom in 1948,<ref name="Holmstrom1948">{{cite journal |author=JE Holmstrom |title=Section III. Opening Plenary Session |journal=The Royal Society Scientific Information Conference, 21 June-2 July 1948: Report and Papers Submitted |pages=85 |year=1948 |url=https://books.google.com/books?id=M34lAAAAMAAJ&q=univac}}</ref> in a paper that contains an early mention of the [[Univac]] computer.

Automated information retrieval systems were introduced in the 1950s: one even featured in the 1957 romantic comedy ''[[Desk Set]]''. In the 1960s, the first large information retrieval research group was formed by [[Gerard Salton]] at Cornell. By the 1970s several different retrieval techniques had been shown to perform well on small [[text corpora]] such as the Cranfield collection (several thousand documents).<ref name="Singhal2001" /> Large-scale retrieval systems, such as the Lockheed Dialog system, came into use early in the 1970s.

In 1992, the US Department of Defense, along with the [[National Institute of Standards and Technology]] (NIST), cosponsored the [[Text Retrieval Conference]] (TREC) as part of the TIPSTER text program. The aim was to support the information retrieval community by supplying the infrastructure needed to evaluate text retrieval methodologies on a very large text collection. This catalyzed research on methods that [[scalability|scale]] to huge corpora. The introduction of [[web search engine]]s boosted the need for very large scale retrieval systems even further.

By the late 1990s, the rise of the World Wide Web fundamentally transformed information retrieval. While early search engines such as [[AltaVista]] (1995) and [[Yahoo! Inc. (1995–2017)|Yahoo!]] (1994) offered keyword-based retrieval, they were limited in scale and ranking refinement.
The breakthrough came in 1998 with the founding of [[Google]], which introduced the [[PageRank]] algorithm,<ref name=":2">{{Cite web |title=The Anatomy of a Search Engine |url=http://infolab.stanford.edu/~backrub/google.html |access-date=2025-04-09 |website=infolab.stanford.edu}}</ref> using the web's hyperlink structure to assess page importance and improve relevance ranking (a minimal sketch of the underlying iteration appears below).

During the 2000s, web search systems evolved rapidly with the integration of machine learning techniques. These systems began to incorporate user behavior data (e.g., click-through logs), query reformulation, and content-based signals to improve search accuracy and personalization. In 2009, [[Microsoft]] launched [[Microsoft Bing|Bing]], introducing features that would later incorporate [[Semantic Web|semantic]] web technologies through the development of its Satori knowledge base. Academic analyses<ref name=":3">{{Cite journal |last1=Uyar |first1=Ahmet |last2=Aliyu |first2=Farouk Musa |date=2015-01-01 |title=Evaluating search features of Google Knowledge Graph and Bing Satori: Entity types, list searches and query interfaces |url=https://www.emerald.com/insight/content/doi/10.1108/oir-10-2014-0257/full/html |journal=Online Information Review |volume=39 |issue=2 |pages=197–213 |doi=10.1108/OIR-10-2014-0257 |issn=1468-4527}}</ref> have highlighted Bing's semantic capabilities, including structured data use and entity recognition, as part of a broader industry shift toward improving search relevance and understanding user intent through natural language processing.

A major leap occurred in 2018, when Google deployed [[BERT (language model)|BERT]] ('''B'''idirectional '''E'''ncoder '''R'''epresentations from '''T'''ransformers) to better understand the contextual meaning of queries and documents. This marked one of the first times deep neural language models were used at scale in real-world retrieval systems.<ref name=":4">{{cite arXiv | eprint=1810.04805 | last1=Devlin | first1=Jacob | last2=Chang | first2=Ming-Wei | last3=Lee | first3=Kenton | last4=Toutanova | first4=Kristina | title=BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | date=2018 | class=cs.CL }}</ref> BERT's bidirectional training enabled a more refined comprehension of word relationships in context, improving the handling of natural language queries. Because of its success, transformer-based models gained traction in academic research and commercial search applications.<ref>{{Cite journal |last1=Gardazi |first1=Nadia Mushtaq |last2=Daud |first2=Ali |last3=Malik |first3=Muhammad Kamran |last4=Bukhari |first4=Amal |last5=Alsahfi |first5=Tariq |last6=Alshemaimri |first6=Bader |date=2025-03-15 |title=BERT applications in natural language processing: a review |url=https://link.springer.com/article/10.1007/s10462-025-11162-5 |journal=Artificial Intelligence Review |language=en |volume=58 |issue=6 |pages=166 |doi=10.1007/s10462-025-11162-5 |issn=1573-7462 |doi-access=free }}</ref> Simultaneously, the research community began exploring neural ranking models that outperformed traditional lexical-based methods.
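The core of PageRank is a power iteration that repeatedly redistributes rank mass along hyperlinks. The following is a minimal sketch of that computation, not Google's production implementation; the function name, toy link matrix, and tolerance are illustrative choices, while the damping factor 0.85 is the value suggested in the original paper.

<syntaxhighlight lang="python">
import numpy as np

def pagerank(links, damping=0.85, tol=1e-9, max_iter=100):
    """Rank pages by the stationary distribution of a 'random surfer'.

    links[i, j] == 1 means page j links to page i, so each column
    holds the out-links of one page.
    """
    n = links.shape[0]
    out_degree = links.sum(axis=0)

    # Column-stochastic transition matrix; a dangling page (one with
    # no out-links) is treated as linking to every page uniformly.
    M = np.zeros((n, n))
    for j in range(n):
        M[:, j] = 1.0 / n if out_degree[j] == 0 else links[:, j] / out_degree[j]

    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # With probability `damping` the surfer follows a link,
        # otherwise jumps to a page chosen uniformly at random.
        new_rank = (1 - damping) / n + damping * M @ rank
        if np.allclose(new_rank, rank, atol=tol):
            return new_rank
        rank = new_rank
    return rank

# Toy web of four pages: 0 -> {1, 2}, 1 -> {2}, 2 -> {0}, 3 -> {2}.
links = np.array([[0, 0, 1, 0],
                  [1, 0, 0, 0],
                  [1, 1, 0, 1],
                  [0, 0, 0, 0]], dtype=float)
print(pagerank(links))  # page 2, with the most in-links, scores highest
</syntaxhighlight>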
Long-standing benchmarks such as the '''T'''ext '''RE'''trieval '''C'''onference ([[Text Retrieval Conference|TREC]]), initiated in 1992, and more recent evaluation frameworks such as MS MARCO ('''MA'''chine '''R'''eading '''CO'''mprehension, 2016)<ref name=":5">{{cite arXiv | eprint=1611.09268 | last1=Bajaj | first1=Payal | last2=Campos | first2=Daniel | last3=Craswell | first3=Nick | last4=Deng | first4=Li | last5=Gao | first5=Jianfeng | last6=Liu | first6=Xiaodong | last7=Majumder | first7=Rangan | last8=McNamara | first8=Andrew | last9=Mitra | first9=Bhaskar | last10=Nguyen | first10=Tri | last11=Rosenberg | first11=Mir | last12=Song | first12=Xia | last13=Stoica | first13=Alina | last14=Tiwary | first14=Saurabh | last15=Wang | first15=Tong | title=MS MARCO: A Human Generated MAchine Reading COmprehension Dataset | date=2016 | class=cs.CL }}</ref> became central to training and evaluating retrieval systems across multiple tasks and domains. MS MARCO has also been adopted in the TREC Deep Learning Tracks, where it serves as a core dataset for evaluating advances in neural ranking models within a standardized benchmarking environment.<ref>{{Cite journal |last1=Craswell |first1=Nick |last2=Mitra |first2=Bhaskar |last3=Yilmaz |first3=Emine |last4=Rahmani |first4=Hossein A. |last5=Campos |first5=Daniel |last6=Lin |first6=Jimmy |last7=Voorhees |first7=Ellen M. |last8=Soboroff |first8=Ian |date=2024-02-28 |title=Overview of the TREC 2023 Deep Learning Track |url=https://www.microsoft.com/en-us/research/publication/overview-of-the-trec-2023-deep-learning-track/ |language=en-US}}</ref>

As deep learning became integral to information retrieval systems, researchers began to categorize neural approaches into three broad classes: '''sparse''', '''dense''', and '''hybrid''' models. Sparse models, including traditional term-based methods and learned variants such as SPLADE, rely on interpretable representations and inverted indexes to enable efficient exact term matching with added semantic signals.<ref name=":0">{{arxiv|2107.09226}}</ref> Dense models, such as dual-encoder and late-interaction architectures like ColBERT, use continuous vector embeddings to support semantic similarity beyond keyword overlap.<ref name=":8">{{Cite book |last1=Khattab |first1=Omar |last2=Zaharia |first2=Matei |chapter=ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT |date=2020-07-25 |title=Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval |chapter-url=https://dl.acm.org/doi/10.1145/3397271.3401075 |series=SIGIR '20 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=39–48 |doi=10.1145/3397271.3401075 |isbn=978-1-4503-8016-4}}</ref> Hybrid models aim to combine the advantages of both, balancing the lexical (token) precision of sparse methods with the semantic depth of dense models; a common approach is to normalize and mix the two kinds of scores, as in the sketch below. This categorization reflects the trade-offs among scalability, relevance, and efficiency in retrieval systems.<ref name=":1">{{cite arXiv | eprint=2010.06467 | last1=Lin | first1=Jimmy | last2=Nogueira | first2=Rodrigo | last3=Yates | first3=Andrew | title=Pretrained Transformers for Text Ranking: BERT and Beyond | date=2020 | class=cs.IR }}</ref>

As IR systems increasingly rely on deep learning, concerns around bias, fairness, and explainability have also come to the fore. Research now focuses not only on relevance and efficiency, but also on transparency, accountability, and user trust in retrieval algorithms.
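The hybrid class described above can be made concrete with a deliberately simplified sketch that mixes a hand-rolled BM25 sparse score with a dense cosine-similarity score. It assumes the third-party <code>sentence-transformers</code> package; the checkpoint name <code>all-MiniLM-L6-v2</code>, the toy corpus, and the equal weighting are illustrative assumptions rather than a standard recipe.

<syntaxhighlight lang="python">
import math
from collections import Counter

# Assumes: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Sparse signal: classic BM25 over whitespace-tokenized documents."""
    corpus = [d.lower().split() for d in docs]
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    doc_freq = Counter(term for d in corpus for term in set(d))
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def minmax(xs):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo + 1e-9) for x in xs]

def hybrid_rank(query, docs, model, alpha=0.5):
    """Mix normalized sparse (BM25) and dense (cosine) scores."""
    sparse = minmax(bm25_scores(query, docs))
    dense = minmax(util.cos_sim(model.encode(query, convert_to_tensor=True),
                                model.encode(docs, convert_to_tensor=True))[0].tolist())
    mixed = [alpha * s + (1 - alpha) * d for s, d in zip(sparse, dense)]
    return sorted(zip(mixed, docs), reverse=True)

docs = ["the cat sat on the mat",
        "dogs are loyal companions",
        "felines often nap in sunny spots"]
model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint
print(hybrid_rank("cat nap", docs, model))
</syntaxhighlight>

Here the query "cat nap" matches document 1 lexically ("cat") but document 3 only semantically ("felines", "nap"); the sparse and dense components of the mixed score capture these two cases respectively, which is the motivation for hybrid retrieval.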