Lossless compression
===Genetics and genomics===
[[Compression of genomic sequencing data|Genetics compression algorithms]] (not to be confused with [[genetic algorithm]]s) are the latest generation of lossless algorithms that compress data (typically sequences of nucleotides) using both conventional compression algorithms and specific algorithms adapted to genetic data. In 2012, a team of scientists from Johns Hopkins University published the first genetic compression algorithm that does not rely on external genetic databases for compression. HapZipper was tailored for [[International_HapMap_Project|HapMap]] data and achieves over 20-fold compression (95% reduction in file size), providing 2- to 4-fold better compression, in much less time, than leading general-purpose compression utilities.<ref>{{cite journal |author=Chanda, P. |author2=Elhaik, E. |author3=Bader, J.S. |title=HapZipper: sharing HapMap populations just got easier |journal=Nucleic Acids Res |pages=1–7 |year=2012 |pmid=22844100 |doi=10.1093/nar/gks709 |volume=40 |issue=20 |pmc=3488212}}</ref>

Genomic sequence compression algorithms, also known as DNA sequence compressors, exploit the fact that DNA sequences have characteristic properties, such as inverted repeats. The most successful compressors are XM and GeCo.<ref name=Pratas>{{cite book |last1=Pratas |first1=D. |last2=Pinho |first2=A. J. |last3=Ferreira |first3=P. J. S. G. |date=2016 |chapter=Efficient compression of genomic sequences |title=Data Compression Conference |location=Snowbird, Utah |url=http://sweet.ua.pt/pratas/papers/Pratas-2016b.pdf}}</ref> For [[eukaryotes]], XM achieves a slightly better compression ratio, though for sequences larger than 100 MB its computational requirements are impractical.
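
The two quantitative ideas in this section can be illustrated with a minimal Python sketch. It is not the method used by HapZipper, XM or GeCo; the function names (<code>size_reduction</code>, <code>reverse_complement</code>, <code>find_inverted_repeats</code>) are hypothetical and chosen only for illustration. The first function shows why a 20-fold compression ratio corresponds to a 95% reduction in file size. The other two show what an inverted repeat is: a stretch of DNA whose reverse complement occurs elsewhere in the same sequence, a redundancy that a specialised compressor can refer back to instead of encoding the bases again.

```python
# Illustrative sketch only; not taken from any of the cited compressors.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def size_reduction(ratio: float) -> float:
    """Fraction of the original size saved at a given compression ratio."""
    return 1.0 - 1.0 / ratio


def reverse_complement(seq: str) -> str:
    """Complement each base (A<->T, C<->G), then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, k: int) -> list[tuple[int, int]]:
    """Return (i, j) with i < j where seq[j:j+k] is the reverse
    complement of the earlier k-mer seq[i:i+k]."""
    seen: dict[str, list[int]] = {}  # k-mer -> earlier start positions
    hits: list[tuple[int, int]] = []
    for j in range(len(seq) - k + 1):
        kmer = seq[j:j + k]
        # Any earlier k-mer equal to this one's reverse complement is a hit.
        for i in seen.get(reverse_complement(kmer), []):
            hits.append((i, j))
        seen.setdefault(kmer, []).append(j)
    return hits


if __name__ == "__main__":
    print(f"20-fold compression saves {size_reduction(20):.0%} of the file")
    print(find_inverted_repeats("AACGGGGGCGTT", 4))
```

In the example sequence, the last four bases (<code>CGTT</code>) are the reverse complement of the first four (<code>AACG</code>), so the search reports the pair of start positions 0 and 8. A real compressor would use such matches to predict upcoming bases rather than merely list them.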