=== Assembly ===
{{Main|Sequence assembly}}
{{multiple image
 | direction = vertical
 | align = right
 | width = 300
 | image1 = PET contig scaffold.png
 | caption1 = Overlapping reads form contigs; contigs and gaps of known length form scaffolds.
 | image2 = Mapping Reads.png
 | caption2 = Paired-end reads of next-generation sequencing data mapped to a reference genome.
 | footer = Multiple, fragmented sequence reads must be assembled together on the basis of their overlapping areas.
}}
Sequence assembly refers to [[sequence alignment|aligning]] and merging fragments of a much longer [[DNA]] sequence in order to reconstruct the original sequence.<ref name = "Pevsner_2009"/> This is needed because current [[DNA sequencing]] technology cannot read whole genomes as a continuous sequence; instead it reads small pieces of between 20 and 1000 bases, depending on the technology used. Third-generation sequencing technologies such as PacBio or Oxford Nanopore routinely generate sequencing reads 10-100 kb in length; however, they have a high error rate of approximately 1 percent.<ref name = "PacBio" /><ref name = "nanoporetech" /> Typically the short fragments, called reads, result from [[shotgun sequencing]] of [[genome|genomic]] DNA or of [[Transcription (genetics)|gene transcripts]] ([[expressed sequence tag|ESTs]]).<ref name = "Pevsner_2009"/>

==== Assembly approaches ====
Assembly can be broadly categorized into two approaches: ''de novo'' assembly, for genomes that are not similar to any sequenced in the past, and comparative assembly, which uses the existing sequence of a closely related organism as a reference during assembly.<ref name = "Pop_2008"/> Relative to comparative assembly, ''de novo'' assembly is computationally difficult ([[NP-hard]]), making it less favourable for short-read NGS technologies. Within the ''de novo'' assembly paradigm there are two primary strategies: Eulerian path strategies and overlap-layout-consensus (OLC) strategies. OLC strategies ultimately try to find a Hamiltonian path through an overlap graph, which is an NP-hard problem. Eulerian path strategies are computationally more tractable because they try to find an Eulerian path through a de Bruijn graph.<ref name = "Pop_2008"/>

==== Finishing ====
Finished genomes are defined as having a single contiguous sequence with no ambiguities representing each [[Replicon (genetics)|replicon]].<ref name = "Chain_2009"/>
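
As an illustration of the Eulerian path strategy described under ''Assembly approaches'', the following minimal sketch builds a de Bruijn graph from a handful of reads and walks an Eulerian path through it with Hierholzer's algorithm. It is a toy under strong assumptions, not how any particular assembler works: reads are error-free, come from a single strand, and each distinct ''k''-mer is used exactly once, so repeats longer than ''k''-1 bases are not resolved. The function names (<code>de_bruijn_graph</code>, <code>eulerian_path</code>, <code>assemble</code>) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted only to make the result deterministic
        graph[kmer[:-1]].append(kmer[1:])  # edge: prefix (k-1)-mer -> suffix (k-1)-mer
    return graph


def eulerian_path(graph):
    """Return the nodes of an Eulerian path (Hierholzer's algorithm).

    Assumption: the graph actually contains an Eulerian path.
    """
    if not graph:
        return []
    in_deg = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            in_deg[node] += 1
    # Start where out-degree exceeds in-degree by one, if such a node exists;
    # otherwise the path is a cycle and any node with outgoing edges will do.
    start = next(iter(graph))
    for node, successors in graph.items():
        if len(successors) - in_deg[node] == 1:
            start = node
            break
    remaining = {node: list(successors) for node, successors in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is final in the path
    return path[::-1]


def assemble(reads, k):
    """Spell out the sequence along the Eulerian path."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    if not path:
        return ""
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    # Three overlapping, error-free reads from the toy sequence ATGGCGTGCA.
    print(assemble(["ATGGCGT", "GGCGTGC", "CGTGCA"], 4))  # prints ATGGCGTGCA
```

Each edge is visited exactly once, so the walk runs in time linear in the number of distinct ''k''-mers. An OLC formulation would instead have to visit every read (node) exactly once, which is the Hamiltonian path problem.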