The next generation of DNA sequencing: understanding modern genomics technologies

By Alex P.

On October 21st, 2004, the International Human Genome Sequencing Consortium published a near-complete sequence of the human genome, the culmination of a multibillion-dollar initiative to understand the genetics of our species. By 2022, the cost to sequence a human genome neared $1,000. This drastic price reduction has led to new advances in understanding cellular function, disease, and personalized medicine. How did this revolutionary cost decrease occur? While many innovations contributed, today we’ll explain “Next Generation Sequencing (NGS),” one such technological contribution.

This blog post assumes a basic familiarity with the structure of DNA and the polymerase chain reaction (PCR). For a primer on DNA and PCR, I recommend Khan Academy’s excellent intros.

Illumina (Solexa) sequencing: 

In this form of “sequencing-by-synthesis”, DNA is first cut into small pieces of 100-1000 base pairs (the exact length depends on the experiment). A scientist then adds a short “adapter”, a string of nucleotides with a predetermined sequence, to the ends of each fragment. This technique is called adapter ligation, and is frequently used in DNA sequencing preparation. From there, the two strands of DNA are separated and washed across a “flow cell”, a glass slide covered in short strands of DNA that are complementary to the adapters. These short strands bind to the adapters and immobilize the DNA. Each immobilized fragment is then copied in place into a dense cluster of identical strands, so that the signal from each spot is bright enough to image. A primer is then attached to the adapter region, and a PCR-like process begins where a polymerase extends the DNA. Unlike in PCR, each nucleotide here is chemically modified and bound to a fluorophore (a compound that can emit light). The modification blocks the polymerase from adding any further bases, so synthesis pauses after exactly one nucleotide is incorporated. A camera takes an image of the glowing fluorophore, and then the fluorophore and the blocking modification are removed, allowing the next nucleotide to be added to the growing strand. This cycle is repeated over and over so that an image is acquired for each base. Because each of the four nucleotide types carries its own distinct fluorophore, one can infer the sequence of a DNA strand from this series of images!
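To make the image-to-sequence step concrete, here is a minimal sketch in Python. It assumes each imaging cycle yields one brightness value per nucleotide type for a given spot on the flow cell; the function name, the channel order, and all intensity values are invented for illustration. Real instruments also correct for overlap between colors and for strands that fall out of step, which this toy version ignores.

```python
# Toy base caller (illustrative only, not any vendor's actual software).
# Assumption: every cycle gives four intensities, ordered like CHANNELS.
CHANNELS = ("A", "C", "G", "T")


def call_bases(cycles):
    """Pick the brightest channel in each cycle and join the calls into a read."""
    read = []
    for intensities in cycles:
        # Index of the channel with the highest intensity in this cycle.
        brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Invented intensities for four cycles at one spot on the flow cell.
example_cycles = [
    (0.9, 0.1, 0.0, 0.1),  # brightest channel: A
    (0.1, 0.0, 0.8, 0.1),  # brightest channel: G
    (0.0, 0.1, 0.1, 0.9),  # brightest channel: T
    (0.1, 0.9, 0.1, 0.0),  # brightest channel: C
]

print(call_bases(example_cycles))  # AGTC
```

Note that the read is the newly synthesized strand; the original template fragment is its complement.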

Roche 454 sequencing (pyrosequencing): 

While Illumina’s sequencing method is now the industry standard, from the 2000s to the early 2010s pyrosequencing held a significant share of the sequencing market. This technique uses similar adapter-ligation and sequencing-by-synthesis principles. Instead of washing 100-1000 base pair-long DNA across a “flow cell”, here DNA is attached to tiny resin beads covered in short DNA strands that complement the adapters. These beads are emulsified in oil such that, statistically, a droplet holding a single bead should trap only one DNA strand with it. The DNA strands are amplified by a process called emulsion PCR, where each bead and its DNA strand undergo PCR inside a single oil-surrounded droplet. The beads are then filtered to remove any that failed to attach to a DNA fragment and moved to a sequencing plate full of wells that hold one bead each. Similar to Illumina's process, a primer that attaches to the adapter region is then added, and a PCR-like process begins where one nucleotide type (A, T, C, or G) is washed across the sequencing plate at a time. If the nucleotide is complementary to the next base on the template, the polymerase incorporates it and releases a pyrophosphate, which is combined with adenylyl sulfate and converted to ATP by the enzyme ATP sulfurylase. The enzyme luciferase then uses this ATP to produce light. After a nucleotide type is washed across the plate, excess nucleotides are removed by the enzyme apyrase, and a different nucleotide is added. By repeatedly flowing different nucleotides across the sequencing plate and detecting light signals with a camera, the sequence of DNA can be decoded! Because of the iterative, one-nucleotide-at-a-time nature of this technology, 454 sequencing is comparatively slower than Illumina sequencing. It also has difficulty decoding runs of repeated bases (e.g. AAAA or GGGG): the whole run is incorporated in a single flow, so its length must be judged from how bright the signal is. By mid-2016, production of 454 sequencers was halted because the platform could no longer compete in the market.
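The flow-by-flow logic can be sketched in a few lines of Python. This is a simplified model, not the instrument's real software: it assumes nucleotides are flowed in a fixed repeating order (the order "TACG" below is an assumption), that the light signal is proportional to the number of bases incorporated in that flow, and the function names are made up for this example.

```python
# Simplified pyrosequencing model (illustrative sketch with invented names).
FLOW_ORDER = "TACG"  # assumed repeating order in which nucleotides are flowed


def simulate_flowgram(new_strand, flow_order=FLOW_ORDER):
    """Return one idealized light signal per flow for the strand being built."""
    if any(base not in flow_order for base in new_strand):
        raise ValueError("strand contains a base that is never flowed")
    signals = []
    pos = 0   # position of the next base to be incorporated
    flow = 0  # number of flows performed so far
    while pos < len(new_strand):
        base = flow_order[flow % len(flow_order)]
        run = 0
        # Every consecutive matching base is incorporated in the same flow.
        while pos < len(new_strand) and new_strand[pos] == base:
            run += 1
            pos += 1
        signals.append(float(run))  # signal assumed proportional to run length
        flow += 1
    return signals


def decode_flowgram(signals, flow_order=FLOW_ORDER):
    """Round each signal to a whole number of bases and rebuild the sequence."""
    bases = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        bases.append(base * round(signal))
    return "".join(bases)


signals = simulate_flowgram("TTAGGGC")
print(signals)                   # [2.0, 1.0, 0.0, 3.0, 0.0, 0.0, 1.0]
print(decode_flowgram(signals))  # TTAGGGC
```

With noisy measurements, a signal such as 3.4 could plausibly mean three or four bases, which is the repeated-base problem described above in miniature.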

Ion Torrent sequencing: 

Similar to 454 pyrosequencing, Ion Torrent’s technology uses a bead-based, emulsion PCR preparation. Instead of an imaging-based system, however, the DNA-covered beads are loaded onto a semiconductor chip. Different nucleotides are washed across the chip one at a time, along with the requisite primer and polymerase, and when a correct nucleotide is incorporated, a hydrogen ion is released. This changes the solution pH, which the semiconductor detects as a voltage change, allowing one to track nucleotide identity. When the template contains a run of identical bases, several nucleotides are incorporated in the same flow and multiple hydrogen ions are released, causing a proportionally larger voltage change; this allows the length of runs of repeated bases (e.g. AAAA or GGGG) to be estimated. Since no camera is needed in this technology, this process is typically faster than pyrosequencing.
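As a rough illustration of how a voltage trace could be turned into bases, the Python sketch below divides each flow's voltage change by the change expected from a single incorporation and rounds to the nearest whole number. The function name, the flow order, the 1.0 mV step, and every voltage value are assumptions chosen for the example, not measurements from a real chip.

```python
# Illustrative conversion of per-flow voltage changes into base counts.
def estimate_run_lengths(voltage_changes_mv, single_base_step_mv):
    """Estimate how many bases were incorporated in each flow."""
    if single_base_step_mv <= 0:
        raise ValueError("single_base_step_mv must be positive")
    # Small negative readings are treated as noise and clamped to zero.
    return [max(0, round(dv / single_base_step_mv)) for dv in voltage_changes_mv]


flows = "TACGTAC"  # assumed order of nucleotides washed across the chip
voltage_changes_mv = [1.9, 1.1, 0.05, 3.2, 0.0, 0.1, 0.95]  # invented readings

counts = estimate_run_lengths(voltage_changes_mv, single_base_step_mv=1.0)
read = "".join(base * n for base, n in zip(flows, counts))

print(counts)  # [2, 1, 0, 3, 0, 0, 1]
print(read)    # TTAGGGC
```

The longer the run, the smaller the relative difference between neighboring counts (seven versus eight bases differ by far less, proportionally, than one versus two), so estimates for long runs are less certain in this simple model.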

Sources:

Bharagava, R. N., Purchase, D., Saxena, G., & Mulla, S. I. (2019). Applications of metagenomics in microbial bioremediation of pollutants. Microbial Diversity in the Genomic Era, 459–477. https://doi.org/10.1016/b978-0-12-814849-5.00026-5

International Human Genome Sequencing Consortium. (2004). Finishing the euchromatic sequence of the human genome. Nature, 431(7011), 931–945. https://doi.org/10.1038/nature03001 

National Human Genome Research Institute. (2021, November 1). The cost of sequencing a human genome. Genome.gov. Retrieved January 29, 2023, from https://www.genome.gov/about-genomics/fact-sheets/Sequencing-Human-Genome-cost 

Mashayekhi, F., & Ronaghi, M. (2007). Analysis of read length limiting factors in pyrosequencing chemistry. Analytical Biochemistry, 363(2), 275–287. https://doi.org/10.1016/j.ab.2007.02.002

Public Engagement Team at Wellcome Genome Campus. (2021, July 21). What is the 454 method of DNA sequencing? yourgenome. Retrieved January 29, 2023, from https://www.yourgenome.org/facts/what-is-the-454-method-of-dna-sequencing/

Slatko, B. E., Gardner, A. F., & Ausubel, F. M. (2018). Overview of next‐generation sequencing technologies. Current Protocols in Molecular Biology, 122(1). https://doi.org/10.1002/cpmb.59
