== Molecular organization and gene content ==

The human [[reference genome]] does not represent the sequence of any specific individual, nor does it represent the sequence of all of the DNA found within a cell. It includes only one copy of each of the paired, homologous autosomes plus one copy of each of the two sex chromosomes (X and Y). The total amount of DNA in this reference genome is 3.1 billion base pairs (3.1 Gb).<ref>{{cite web |url=https://www.ensembl.org/Homo_sapiens/Info/Annotation |title=Human genome assembly |author=<!--Not stated-->|date= |website=Ensembl |publisher= |access-date=2024-01-23 |quote=}}</ref>

=== Protein-coding genes ===

Protein-coding sequences represent the most widely studied and best understood component of the human genome. These sequences ultimately lead to the production of all human [[protein]]s, although several biological processes (e.g. [[V(D)J recombination|DNA rearrangements]] and [[alternative splicing|alternative pre-mRNA splicing]]) can lead to the production of many more unique proteins than the number of protein-coding genes. 

The human reference genome contains somewhere between 19,000 and 20,000 protein-coding genes.<ref name="Genome">{{cite web |title=Gene |url=https://www.genome.gov/genetics-glossary/Gene#:~:text=And%20genes%20are%20the%20part,of%20the%20entire%20human%20genome. |website=www.genome.gov |access-date=7 January 2025 |language=en}}</ref><ref name = Amaraletal2023 >{{cite journal | vauthors = Amaral P, Carbonell-Sala S, De La Vega FM, Faial T, Frankish A, Gingeras T, Guigo R, Harrow JL, Hatzigeorgiou AG, and Johnson R  | date = 2023 | title = The status of the human gene catalogue | journal = Nature | volume = 622 | issue = 7981 | pages = 41–47 | doi = 10.1038/s41586-023-06490-x | pmid = 37794265 | pmc = 10575709 | arxiv = 2303.13996 | bibcode = 2023Natur.622...41A }}</ref> These genes contain an average of 10 introns and the average size of an intron is about 6 kb (6,000 bp).<ref name="Piovesanetal2019"/> This means that the average size of a protein-coding gene is about 62 kb and these genes take up about 40% of the genome.<ref>{{cite journal | vauthors = Francis WR, Wörheide G | title = Similar Ratios of Introns to Intergenic Sequence across Animal Genomes | journal = Genome Biology and Evolution | volume = 9 | issue = 6 | pages = 1582–1598 | date = June 2017 | pmid = 28633296 | pmc = 5534336 | doi = 10.1093/gbe/evx103 }}</ref>
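The arithmetic behind these figures can be reproduced with a short Python sketch. It is only a back-of-the-envelope check, not the method used in the cited sources: the exon contribution of about 2 kb per gene is an assumption inferred from the difference between 62 kb and 60 kb, and the gene count uses the upper end of the quoted range.

```python
# Back-of-the-envelope check of the gene-size figures quoted in the text.
GENOME_BP = 3_100_000_000   # reference genome size (from the text)
GENE_COUNT = 20_000         # upper end of the 19,000-20,000 range (from the text)
INTRONS_PER_GENE = 10       # average number of introns (from the text)
INTRON_BP = 6_000           # average intron size (from the text)
EXON_BP_PER_GENE = 2_000    # ASSUMPTION: implied by 62 kb - 60 kb, not stated in the source


def average_gene_bp() -> int:
    """Average gene length: intron total plus the assumed exon total."""
    return INTRONS_PER_GENE * INTRON_BP + EXON_BP_PER_GENE


def genome_fraction() -> float:
    """Fraction of the reference genome covered by protein-coding genes."""
    return GENE_COUNT * average_gene_bp() / GENOME_BP


print(average_gene_bp())             # 62000
print(round(genome_fraction(), 2))   # 0.4
```

Using 19,000 genes instead of 20,000 would give a slightly lower fraction, so the 40% figure is best read as approximate.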

Exon sequences consist of coding DNA and untranslated regions (UTRs) at either end of the mature mRNA. The total amount of coding DNA is about 1–2% of the genome.<ref name = Hatje_et_al_2019>{{cite journal | vauthors = Hatje K, Mühlhausen S, Simm D, Killmar M | title =  The Protein-Coding Human Genome: Annotating High-Hanging Fruits. | year = 2019 | journal = BioEssays | volume = 41 | issue = 11 | pages = 1900066 | doi = 10.1002/bies.201900066| pmid = 31544971 }}</ref><ref name= Piovesanetal2019>{{cite journal | vauthors = Piovesan A, Antonaros F, Vitale L, Strippoli P, Pelleri MC, Caracausi M | title = Human protein-coding genes and gene feature statistics in 2019 | year = 2019 | journal = BMC Research Notes | volume =12 | issue = 1 | pages = 315 | doi = 10.1186/s13104-019-4343-8| doi-access = free | pmid = 31164174 | pmc = 6549324 }}</ref>

Many people divide the genome into coding and non-coding DNA based on the idea that coding DNA is the most important functional component of the genome. About 98–99% of the human genome is non-coding DNA.
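To give a sense of scale, the percentages above can be converted into base-pair counts. The following Python sketch is illustrative only: it applies the 1–2% coding range to the 3.1 Gb reference genome size quoted earlier, and the function name is hypothetical.

```python
# Convert the quoted coding percentages into approximate base-pair counts.
GENOME_BP = 3_100_000_000  # reference genome size (from the text)


def split_genome(coding_percent: int) -> tuple[int, int]:
    """Return (coding_bp, non_coding_bp) for a whole-number coding percentage."""
    coding_bp = GENOME_BP * coding_percent // 100
    return coding_bp, GENOME_BP - coding_bp


for percent in (1, 2):
    coding, non_coding = split_genome(percent)
    print(percent, coding, non_coding)
# 1 31000000 3069000000
# 2 62000000 3038000000
```

Under these assumptions, coding DNA amounts to roughly 31–62 Mb of the reference genome.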

=== Non-coding genes ===
{{Main|Non-coding RNA|Non-coding DNA}}

Noncoding RNA molecules play many essential roles in cells, especially in the many reactions of [[Translation (biology)|protein synthesis]] and [[Post-transcriptional modification|RNA processing]]. Noncoding genes include those for [[tRNA]]s, [[Ribosome|ribosomal]] RNAs, [[microRNA]]s, [[snRNA]]s and [[lncRNA|long non-coding RNA]]s (lncRNAs).<ref name="ENCODEScience">{{cite journal | vauthors = Pennisi E | title = Genomics. ENCODE project writes eulogy for junk DNA | journal = Science | volume = 337 | issue = 6099 | pages = 1159–1161 | date = Sep 2012 | pmid = 22955811 | doi = 10.1126/science.337.6099.1159 }}</ref><ref>{{cite journal | vauthors = Iyer MK, Niknafs YS, Malik R, Singhal U, Sahu A, Hosono Y, Barrette TR, Prensner JR, Evans JR, Zhao S, Poliakov A, Cao X, Dhanasekaran SM, Wu YM, Robinson DR, Beer DG, Feng FY, Iyer HK, Chinnaiyan AM | title = The landscape of long noncoding RNAs in the human transcriptome | journal = Nature Genetics | volume = 47 | issue = 3 | pages = 199–208 | date = Mar 2015 | pmid = 25599403 | doi = 10.1038/ng.3192 | pmc=4417758}}</ref><ref>{{cite journal | vauthors = Eddy SR | title = Non-coding RNA genes and the modern RNA world | journal = Nature Reviews Genetics | volume = 2 | issue = 12 | pages = 919–929 | date = Dec 2001 | pmid = 11733745 | doi = 10.1038/35103511 | s2cid = 18347629 }}</ref><ref name="MarkelManagadze2013">{{cite journal | vauthors = Managadze D, Lobkovsky AE, Wolf YI, Shabalina SA, Rogozin IB, Koonin EV | title = The vast, conserved mammalian lincRNome | journal = PLOS Computational Biology | volume = 9 | issue = 2 | pages = e1002917 | year = 2013 | pmid = 23468607 | doi = 10.1371/journal.pcbi.1002917 | pmc=3585383| bibcode = 2013PLSCB...9E2917M | doi-access = free }}</ref> The number of reported non-coding genes continues to rise slowly but the exact number in the human genome is yet to be determined. 
Many RNAs are thought to be non-functional.<ref name="PalazzoLee2015">{{cite journal | vauthors = Palazzo AF, Lee ES | title = Non-coding RNA: what is functional and what is junk? | journal = Frontiers in Genetics | volume = 6 | pages = 2 | year = 2015 | pmid = 25674102 | doi = 10.3389/fgene.2015.00002 | pmc=4306305| doi-access = free }}</ref>

Many ncRNAs are critical elements in gene regulation and expression. Noncoding RNA also contributes to epigenetics, transcription, RNA splicing, and the translational machinery. The role of RNA in genetic regulation and disease offers a new potential level of unexplored genomic complexity.<ref>{{cite journal | vauthors = Mattick JS, Makunin IV | title = Non-coding RNA | journal = Human Molecular Genetics | volume = 15 | issue = Spec No 1 | pages = R17–29 | date = Apr 2006 | pmid = 16651366 | doi = 10.1093/hmg/ddl046 | doi-access = free }}</ref>

=== Pseudogenes ===
{{Main|Pseudogene}}

Pseudogenes are inactive copies of protein-coding genes, often generated by [[gene duplication]], that have become nonfunctional through the accumulation of inactivating mutations. The number of pseudogenes in the human genome is on the order of 13,000,<ref name="Pei2012">{{cite journal | vauthors = Pei B, Sisu C, Frankish A, Howald C, Habegger L, Mu XJ, Harte R, Balasubramanian S, Tanzer A, Diekhans M, Reymond A, Hubbard TJ, Harrow J, Gerstein MB | title = The GENCODE pseudogene resource | journal = Genome Biology | volume = 13 | issue = 9 | pages = R51 | year = 2012 | pmid = 22951037 | pmc = 3491395 | doi = 10.1186/gb-2012-13-9-r51 | doi-access = free }}</ref> and in some chromosomes is nearly the same as the number of functional protein-coding genes. Gene duplication is a major mechanism through which new genetic material is generated during [[molecular evolution]].

The [[olfactory receptor]] gene family is one of the best-documented examples of pseudogenes in the human genome. More than 60 percent of the genes in this family are non-functional pseudogenes in humans. By comparison, only 20 percent of genes in the mouse olfactory receptor gene family are pseudogenes. Research suggests that this is a species-specific characteristic, as the most closely related primates all have proportionally fewer pseudogenes. This genetic discovery helps to explain the less acute sense of smell in humans relative to other mammals.<ref>{{cite journal | vauthors = Gilad Y, Man O, Pääbo S, Lancet D | title = Human specific loss of olfactory receptor genes | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 100 | issue = 6 | pages = 3324–3327 | date = Mar 2003 | pmid = 12612342 | pmc = 152291 | doi = 10.1073/pnas.0535697100 | bibcode = 2003PNAS..100.3324G | doi-access = free }}</ref>

=== Regulatory DNA sequences ===<!-- This section is linked from Human genome -->
The human genome has many different [[Regulatory regions|regulatory sequences]] which are crucial to controlling [[gene expression]]. Conservative estimates indicate that these sequences make up 8% of the genome;<ref name="Bernstein_2012">{{cite journal | vauthors = Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M | title = An integrated encyclopedia of DNA elements in the human genome | journal = Nature | volume = 489 | issue = 7414 | pages = 57–74 | date = Sep 2012 | pmid = 22955616 | pmc = 3439153 | doi = 10.1038/nature11247 | bibcode = 2012Natur.489...57T }}</ref> however, extrapolations from the [[ENCODE]] project suggest that 20%<ref>{{cite web | vauthors = Birney E | title =ENCODE: My own thoughts | url = http://genomeinformatician.blogspot.ca/2012/09/encode-my-own-thoughts.html | work = Ewan's Blog: Bioinformatician at large | date =  5 September 2012 }}</ref> or more<ref name="pmid22955972">{{cite journal | vauthors = Stamatoyannopoulos JA | title = What does our genome encode? | journal = Genome Research | volume = 22 | issue = 9 | pages = 1602–1611 | date = Sep 2012 | pmid = 22955972 | pmc = 3431477 | doi = 10.1101/gr.146506.112 }}</ref> of the genome is gene regulatory sequence. Some types of non-coding DNA, called [[enhancer (genetics)|enhancers]], are genetic "switches" that do not encode proteins but do regulate when and where genes are expressed.<ref>{{cite journal | vauthors = Carroll SB, Gompel N, Prudhomme B | date = May 2008 | title = Regulating Evolution | journal = Scientific American | volume = 298 | issue = 5 | pages = 60–67 | doi = 10.1038/scientificamerican0508-60 | pmid = 18444326 | bibcode = 2008SciAm.298e..60C }}</ref>

Regulatory sequences have been known since the late 1960s.<ref name="MillerIppen1968">{{cite journal | vauthors = Miller JH, Ippen K, Scaife JG, Beckwith JR | title = The promoter-operator region of the lac operon of Escherichia coli | journal = J. Mol. Biol. | volume = 38 | issue = 3 | pages = 413–420 | year = 1968 | pmid = 4887877 | doi = 10.1016/0022-2836(68)90395-1 }}</ref> The first identification of regulatory sequences in the human genome relied on recombinant DNA technology.<ref name="WrightRosenthal1984">{{cite journal | vauthors = Wright S, Rosenthal A, Flavell R, Grosveld F | title = DNA sequences required for regulated expression of beta-globin genes in murine erythroleukemia cells | journal = Cell | volume = 38 | issue = 1 | pages = 265–273 | year = 1984 | pmid = 6088069 | doi = 10.1016/0092-8674(84)90548-8 | s2cid = 34587386 }}</ref> Later, with the advent of genomic sequencing, these sequences could be identified by their evolutionary conservation. The evolutionary branch between the [[primates]] and [[mouse]], for example, occurred 70–90 million years ago.<ref>{{cite journal | vauthors = Nei M, Xu P, Glazko G | title = Estimation of divergence times from multiprotein sequences for a few mammalian species and several distantly related organisms | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 98 | issue = 5 | pages = 2497–2502 | date = Feb 2001 | pmid = 11226267 | pmc = 30166 | doi = 10.1073/pnas.051611498 | bibcode = 2001PNAS...98.2497N | doi-access = free }}</ref> Computer comparisons of genome sequences that identify [[conserved non-coding sequence]]s shared between such distantly related species therefore give an indication of their importance in functions such as gene regulation.<ref>{{cite journal | vauthors = Loots GG, Locksley RM, Blankespoor CM, Wang ZE, Miller W, Rubin EM, Frazer KA | title = Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons | journal = Science | volume = 288 | issue = 5463 | pages = 136–140 | date = Apr 2000 | pmid = 10753117 | doi = 10.1126/science.288.5463.136 | bibcode = 2000Sci...288..136L }} [http://www.lbl.gov/Science-Articles/Archive/mouse-dna-model.html Summary] {{Webarchive|url=https://web.archive.org/web/20091106091608/http://www.lbl.gov/Science-Articles/Archive/mouse-dna-model.html |date=6 November 2009 }}</ref>

Other genomes have been sequenced with the same intention of aiding conservation-guided methods, for example the [[pufferfish]] genome.<ref>{{Cite web| vauthors = Meunier M | url = http://www.cns.fr/externe/English/Actualites/Presse/261001_1.html | title = Genoscope and Whitehead announce a high sequence coverage of the Tetraodon nigroviridis genome | publisher = Genoscope | access-date = 12 September 2006 | archive-url = https://web.archive.org/web/20061016085223/http://www.cns.fr/externe/English/Actualites/Presse/261001_1.html <!-- Bot retrieved archive --> | archive-date = 16 October 2006}}</ref> However, regulatory sequences disappear and re-evolve during evolution at a high rate.<ref name="pmid22705669">{{cite journal | vauthors = Romero IG, Ruvinsky I, Gilad Y | title = Comparative studies of gene expression and the evolution of gene regulation | journal = Nature Reviews Genetics | volume = 13 | issue = 7 | pages = 505–516 | date = Jul 2012 | pmid = 22705669 | doi = 10.1038/nrg3229 | pmc = 4034676 }}</ref><ref name="pmid20378774">{{cite journal | vauthors = Schmidt D, Wilson MD, Ballester B, Schwalie PC, Brown GD, Marshall A, Kutter C, Watt S, Martinez-Jimenez CP, Mackay S, Talianidis I, Flicek P, Odom DT | title = Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding | journal = Science | volume = 328 | issue = 5981 | pages = 1036–1040 | date = May 2010 | pmid = 20378774 | pmc = 3008766 | doi = 10.1126/science.1186176 | bibcode = 2010Sci...328.1036S }}</ref><ref name="pmid18787134">{{cite journal | vauthors = Wilson MD, Barbosa-Morais NL, Schmidt D, Conboy CM, Vanes L, Tybulewicz VL, Fisher EM, Tavaré S, Odom DT | title = Species-specific transcription in mice carrying human chromosome 21 | journal = Science | volume = 322 | issue = 5900 | pages = 434–438 | date = Oct 2008 | pmid = 18787134 | pmc = 3717767 | doi = 10.1126/science.1160930 | bibcode = 2008Sci...322..434W }}</ref>

As of 2012, efforts had shifted toward detecting interactions between DNA and regulatory proteins using [[ChIP-Seq]], or toward finding gaps where the DNA is not packaged by [[histone]]s ([[hypersensitive site|DNase hypersensitive sites]]); both approaches indicate where active regulatory sequences lie in the investigated cell type.<ref name="Bernstein_2012"/>

=== Repetitive DNA sequences ===

[[Repetitive DNA|Repetitive DNA sequences]] comprise approximately 50% of the human genome.<ref>{{cite journal | vauthors = Treangen TJ, Salzberg SL | title = Repetitive DNA and next-generation sequencing: computational challenges and solutions | journal = Nature Reviews Genetics | volume = 13 | issue = 1 | pages = 36–46 | date = Jan 2012 | pmid = 22124482 | pmc = 3324860 | doi = 10.1038/nrg3117 }}</ref>

About 8% of the human genome consists of tandem DNA arrays or tandem repeats, low complexity repeat sequences that have multiple adjacent copies (e.g. "CAGCAGCAG...").<ref>{{cite journal | vauthors = Duitama J, Zablotskaya A, Gemayel R, Jansen A, Belet S, Vermeesch JR, Verstrepen KJ, Froyen G | title = Large-scale analysis of tandem repeat variability in the human genome | journal = Nucleic Acids Research | volume = 42 | issue = 9 | pages = 5728–5741 | date = May 2014 | pmid = 24682812 | pmc = 4027155 | doi = 10.1093/nar/gku212 }}</ref> The tandem sequences may be of variable lengths, from two nucleotides to tens of nucleotides. These sequences are highly variable, even among closely related individuals, and so are used for [[genealogical DNA testing]] and [[forensic DNA|forensic DNA analysis]].<ref>{{cite book| vauthors = Pierce BA |title=Genetics : a conceptual approach|date=2012|publisher=W.H. Freeman|location=New York|isbn=978-1-4292-3250-0|pages=538–540|edition=4th}}</ref>

Repeated sequences of fewer than ten nucleotides (e.g. the dinucleotide repeat (AC)<sub>n</sub>) are termed microsatellite sequences. Among the microsatellite sequences, trinucleotide repeats are of particular importance, as they sometimes occur within [[coding region]]s of genes for proteins and may lead to genetic disorders. For example, Huntington's disease results from an expansion of the trinucleotide repeat (CAG)<sub>n</sub> within the ''[[Huntingtin]]'' gene on human chromosome 4. [[Telomeres]] (the ends of linear chromosomes) end with a microsatellite hexanucleotide repeat of the sequence (TTAGGG)<sub>n</sub>.{{citation needed|date=March 2023}}

Tandem repeats of longer sequences (arrays of repeated sequences 10–60 nucleotides long) are termed [[minisatellite]]s.<ref>{{Cite web |title=minisatellite, n. meanings, etymology and more {{!}} Oxford English Dictionary |url=https://www.oed.com/dictionary/minisatellite_n?tab=meaning_and_use&tl=true |access-date=2023-10-08 |website=www.oed.com}}</ref>
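The length-based terminology above can be illustrated with a short Python sketch. It is a simplified illustration, not a tool used in the cited studies: the function names are hypothetical, the label for units longer than 60 nucleotides is an assumption, and the search finds only perfect (uninterrupted) repeats of a fixed unit length using the standard-library <code>re</code> module.

```python
import re


def classify_repeat_unit(unit_length: int) -> str:
    """Label a tandem repeat by the length of its repeated unit."""
    if unit_length < 1:
        raise ValueError("unit length must be positive")
    if unit_length < 10:
        return "microsatellite"   # fewer than ten nucleotides
    if unit_length <= 60:
        return "minisatellite"    # 10-60 nucleotides
    return "unclassified"         # ASSUMPTION: longer units are not named in the text


def find_tandem_repeats(sequence: str, unit_length: int, min_copies: int = 3):
    """Return (start, unit, copies) for perfect tandem runs of a fixed unit length."""
    # A captured unit followed by at least (min_copies - 1) further identical copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_length, min_copies - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // unit_length)
        for m in pattern.finditer(sequence)
    ]


print(find_tandem_repeats("TTCAGCAGCAGCAGTT", 3))  # [(2, 'CAG', 4)]
print(find_tandem_repeats("TTAGGG" * 4, 6))        # [(0, 'TTAGGG', 4)]
print(classify_repeat_unit(6))                     # microsatellite
```

Real repeat annotation has to tolerate mismatches and insertions within a run, which this exact-match sketch does not attempt.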

[[Transposable element|Transposable genetic elements]], DNA sequences that can replicate and insert copies of themselves at other locations within a host genome, are an abundant component in the human genome. The most abundant transposon lineage, ''Alu'', has about 50,000 active copies,<ref name="pmid18836035">{{cite journal | vauthors = Bennett EA, Keller H, Mills RE, Schmidt S, Moran JV, Weichenrieder O, Devine SE | title = Active Alu retrotransposons in the human genome | journal = Genome Research | volume = 18 | issue = 12 | pages = 1875–1883 | date = Dec 2008 | pmid = 18836035 | pmc = 2593586 | doi = 10.1101/gr.081737.108 }}</ref> and can be inserted into intragenic and intergenic regions.<ref name=Liang2013>{{cite journal | vauthors = Liang KH, Yeh CT | title = A gene expression restriction network mediated by sense and antisense Alu sequences located on protein-coding messenger RNAs | journal = BMC Genomics | volume = 14 | pages = 325 | pmid = 23663499 | pmc = 3655826 | doi = 10.1186/1471-2164-14-325 | year=2013 | doi-access = free }}</ref> One other lineage, LINE-1, has about 100 active copies per genome (the number varies between people).<ref name="pmid12682288">{{cite journal | vauthors = Brouha B, Schustak J, Badge RM, Lutz-Prigge S, Farley AH, Moran JV, Kazazian HH | title = Hot L1s account for the bulk of retrotransposition in the human population | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 100 | issue = 9 | pages = 5280–5285 | date = Apr 2003 | pmid = 12682288 | pmc = 154336 | doi = 10.1073/pnas.0831042100 | bibcode = 2003PNAS..100.5280B | doi-access = free }}</ref> Together with non-functional relics of old transposons, they account for over half of total human DNA.<ref>{{cite book | vauthors = Barton NH, Briggs DE, Eisen JA, Goldstein DB, Patel NH | title = Evolution | date = 2007 | publisher = Cold Spring Harbor Laboratory Press | location = Cold Spring Harbor, NY | isbn = 978-0-87969-684-9 
}}{{page needed|date=April 2023}}</ref> Sometimes called "jumping genes", transposons have played a major role in sculpting the human genome. Some of these sequences represent [[endogenous retroviruses]], DNA copies of viral sequences that have become permanently integrated into the genome and are now passed on to succeeding generations. There are also a significant number of [[Human endogenous retrovirus|retroviruses in human DNA]], at least three of which have been proven to possess an important function (i.e., [[HIV]]-like functional HERV-K; envelope genes of non-functional viruses HERV-W and HERV-FRD play a role in placenta formation by inducing cell–cell fusion).

Mobile elements within the human genome can be classified into [[Retrotransposon#LTR retrotransposons|LTR retrotransposons]] (8.3% of total genome), [[short interspersed nuclear element|SINEs]] (13.1% of total genome) including [[Alu elements]], [[long interspersed nuclear element|LINEs]] (20.4% of total genome), SVAs (SINE-[[Variable number tandem repeat|VNTR]]-Alu) and [[Transposable element#Classification|Class II DNA transposons]] (2.9% of total genome).
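The percentages listed above can be added up with a few lines of Python. This is only a tally of the figures given in this paragraph; SVAs are left out because no percentage is stated for them, and the dictionary name is hypothetical.

```python
# Genome fractions (percent) for the mobile element classes listed in the text.
# SVAs are omitted because the text gives no percentage for them.
MOBILE_ELEMENT_PERCENT = {
    "LTR retrotransposons": 8.3,
    "SINEs": 13.1,
    "LINEs": 20.4,
    "Class II DNA transposons": 2.9,
}

# Round to one decimal place to avoid floating-point noise in the sum.
total_percent = round(sum(MOBILE_ELEMENT_PERCENT.values()), 1)
print(total_percent)  # 44.7
```

The classes with stated percentages therefore account for about 44.7% of the genome.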

=== Junk DNA ===
{{Main|Junk DNA}}

There is no consensus on what constitutes a "functional" element in the genome since geneticists, evolutionary biologists, and molecular biologists employ different definitions and methods.<ref name="kellis">{{cite journal | vauthors = Kellis M, Wold B, Snyder MP, Bernstein BE, Kundaje A, Marinov GK, Ward LD, Birney E, Crawford GE, Dekker J, Dunham I, Elnitski LL, Farnham PJ, Feingold EA, Gerstein M, Giddings MC, Gilbert DM, Gingeras TR, Green ED, Guigo R, Hubbard T, Kent J, Lieb JD, Myers RM, Pazin MJ, Ren B, Stamatoyannopoulos JA, Weng Z, White KP, Hardison RC | title = Defining functional DNA elements in the human genome | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 111 | issue = 17 | pages = 6131–6138 | date = April 2014 | pmid = 24753594 | doi = 10.1073/pnas.1318948111 | pmc = 4035993 | bibcode = 2014PNAS..111.6131K | doi-access = free }}</ref><ref>{{cite journal | vauthors = Linquist S, Doolittle WF, Palazzo AF | title = Getting clear about the F-word in genomics | journal = PLOS Genetics | volume = 16 | issue = 4 | pages = e1008702 | date = April 2020 | pmid = 32236092 | pmc = 7153884 | doi = 10.1371/journal.pgen.1008702 | doi-access = free }}</ref> Due to the ambiguity in the terminology, different schools of thought have emerged.<ref>{{cite journal | vauthors = Doolittle WF | title = We simply cannot go on being so vague about 'function' | journal = Genome Biology | volume = 19 | issue = 1 | pages = 223 | date = December 2018 | pmid = 30563541 | pmc = 6299606 | doi = 10.1186/s13059-018-1600-4 | doi-access = free }}</ref> In evolutionary definitions, "functional" DNA, whether it is coding or non-coding, contributes to the fitness of the organism, and therefore is maintained by negative [[evolutionary pressure]] whereas "non-functional" DNA has no benefit to the organism and therefore is under neutral selective pressure. 
Such non-functional DNA has been described as [[junk DNA]].<ref name = "Graur_2017">{{cite book | vauthors = Graur D | chapter = Rubbish DNA: the functionless fraction of the human genome. | doi = 10.1007/978-4-431-56603-8_2 | title = Evolution of the Human Genome I | series = Evolutionary Studies | date = 2017 | pages = 19–60 | publisher = Springer | location = Tokyo | arxiv = 1601.06047 | isbn = 978-4-431-56603-8 | s2cid = 17826096 }}</ref><ref name = "Pena_2021">{{cite book | vauthors = Pena SD | chapter = An Overview of the Human Genome: Coding DNA and Non-Coding DNA |veditors = Haddad LA |title=Human Genome Structure, Function and Clinical Considerations |date=2021 |publisher=Springer Nature |location=Cham |isbn=978-3-03-073151-9 |pages=5–7 | chapter-url=https://books.google.com/books?id=cTYyEAAAQBAJ&dq=junk+DNA+controversy&pg=PA5}}</ref> In genetic definitions, "functional" DNA is related to how DNA segments manifest by phenotype and "nonfunctional" is related to loss-of-function effects on the organism.<ref name="kellis" /> In biochemical definitions, "functional" DNA relates to DNA sequences that specify molecular products (e.g. noncoding RNAs) and biochemical activities with mechanistic roles in gene or genome regulation (i.e. DNA sequences that impact cellular level activity such as cell type, condition, and molecular processes).<ref>{{cite journal | vauthors =Abascal F, Acosta R, Addleman NJ, Adrian J, et al. |title=Expanded Encyclopaedias of DNA elements in the Human and Mouse Genomes |journal=Nature |date=30 July 2020 |volume=583 |issue=7818 |pages=699–710 |doi=10.1038/s41586-020-2493-4|pmid=32728249 |pmc=7410828 |bibcode=2020Natur.583..699E | quote= Operationally, functional elements are defined as discrete, linearly ordered sequence features that specify molecular products (for example, protein-coding genes or noncoding RNAs) or biochemical activities with mechanistic roles in gene or genome regulation (for example, transcriptional promoters or enhancers).}}</ref><ref name="kellis" /> There is no consensus in the literature on the amount of functional DNA since, depending on how "function" is understood, estimates range from up to 90% of the human genome being nonfunctional DNA (junk DNA)<ref>{{cite journal | vauthors = Graur D | title = An Upper Limit on the Functional Fraction of the Human Genome | journal = Genome Biology and Evolution | volume = 9 | issue = 7 | pages = 1880–1885 | date = July 2017 | pmid = 28854598 | pmc = 5570035 | doi = 10.1093/gbe/evx121 }}{{lay source |template=cite news | vauthors = Le Page M |url=https://www.newscientist.com/article/2140926-at-least-75-per-cent-of-our-dna-really-is-useless-junk-after-all/ |title=At least 75 per cent of our DNA really is useless junk after all |date= 17 July 2017 |work= NewScientist }}</ref> to up to 80% of the genome being functional.<ref name=Nature489p57>{{cite journal | vauthors = Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis CA, Doyle F, et al. | collaboration = The ENCODE Project Consortium | title = An integrated encyclopedia of DNA elements in the human genome | journal = Nature | volume = 489 | issue = 7414 | pages = 57–74 | date = September 2012 | pmid = 22955616 | pmc = 3439153 | doi = 10.1038/nature11247 | bibcode = 2012Natur.489...57T | quote = These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions.}}.</ref> It is also possible that junk DNA may acquire a function in the future and therefore may play a role in evolution,<ref name="pmid16237443">{{cite journal | vauthors = Andolfatto P | title = Adaptive evolution of non-coding DNA in Drosophila | journal = Nature | volume = 437 | issue = 7062 | pages = 1149–52 | date = October 2005 | pmid = 16237443 | doi = 10.1038/nature04107 | bibcode = 2005Natur.437.1149A | s2cid = 191219 }} {{lay source |template=cite news |url=https://www.sciencedaily.com/releases/2005/10/051020090946.htm |title=UCSD Study Shows 'Junk' DNA Has Evolutionary Importance |date= 20 October 2005 |work=ScienceDaily |location=Rockville, MD}}</ref> but this is likely to occur only very rarely.<ref name = "Graur_2017" /> Finally, DNA that is deleterious to the organism and is under negative selective pressure is called garbage DNA.<ref name = "Pena_2021" />