In genetics and biochemistry, sequencing means to determine the primary structure (or primary sequence) of an unbranched biopolymer. Sequencing results in a symbolic linear depiction known as a sequence, which succinctly summarizes much of the atomic-level structure of the sequenced molecule.
DNA sequencing is the process of determining the nucleotide order of a given DNA fragment. Thus far, most DNA sequencing has been performed using the chain termination method developed by Frederick Sanger. This technique uses sequence-specific termination of a DNA synthesis reaction using modified nucleotide substrates. However, new sequencing technologies such as pyrosequencing are gaining an increasing share of the sequencing market. More genome data is now being produced by pyrosequencing than by Sanger DNA sequencing. Pyrosequencing has enabled rapid genome sequencing: a bacterial genome can be sequenced in a single run with several-fold coverage. The technique was also recently used to sequence the genome of James Watson.
The sequence of DNA encodes the necessary information for living things to survive and reproduce. Determining the sequence is therefore useful in 'pure' research into why and how organisms live, as well as in applied subjects. Because of the key nature of DNA to living things, knowledge of DNA sequence may come in useful in practically any biological research. For example, in medicine it can be used to identify, diagnose and potentially develop treatments for genetic diseases. Similarly, research into pathogens may lead to treatments for contagious diseases. Biotechnology is a burgeoning discipline, with the potential for many useful products and services.
In chain terminator sequencing (Sanger sequencing), extension is initiated at a specific site on the template DNA by using a short oligonucleotide 'primer' complementary to the template at that region. The oligonucleotide primer is extended using a DNA polymerase, an enzyme that replicates DNA. Included with the primer and DNA polymerase are the four deoxynucleotide bases (DNA building blocks), along with a low concentration of a chain terminating nucleotide (most commonly a di-deoxynucleotide). Limited incorporation of the chain terminating nucleotide by the DNA polymerase results in a series of related DNA fragments that are terminated only at positions where that particular nucleotide is used. The fragments are then size-separated by electrophoresis in a slab polyacrylamide gel, or more commonly now, in a narrow glass tube (capillary) filled with a viscous polymer.
An alternative to the labelling of the primer is to label the terminators instead, commonly called 'dye terminator sequencing'. The major advantage of this approach is that the complete sequencing set can be performed in a single reaction, rather than the four needed with the labelled-primer approach. This is accomplished by labelling each of the dideoxynucleotide chain-terminators with a separate fluorescent dye, each of which fluoresces at a different wavelength. This method is easier and quicker than the dye primer approach, but may produce more uneven data peaks (different heights), due to a template-dependent difference in the incorporation of the large dye chain-terminators. This problem has been significantly reduced with the introduction of new enzymes and dyes that minimize incorporation variability.
Dye terminator sequencing is now used for the vast majority of sequencing reactions as it is both simpler and cheaper. The major reason for this is that the primers do not have to be separately labelled (which can be a significant expense for a single-use custom primer), although this is less of a concern with frequently used 'universal' primers.
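The chain termination readout described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of any real instrument: the function names are hypothetical, the template is assumed to be written in the order the polymerase traverses it, and every possible termination product is assumed to be present and perfectly size-separated.

```python
# Watson-Crick pairing used by the polymerase when copying the template.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminated_fragments(template, terminator):
    """Return the lengths of extension products ending in `terminator`.

    A product of length n ends with the complement of template[n - 1],
    so termination with a given dideoxy base can only occur at those
    positions (idealized: every such product is assumed to form).
    """
    return [
        pos + 1
        for pos, base in enumerate(template)
        if COMPLEMENT[base] == terminator
    ]


def read_sequence(template):
    """Reconstruct the synthesized strand from size-sorted fragments.

    Each fragment carries the label of its terminating base, as in dye
    terminator sequencing; sorting by length mimics electrophoresis.
    """
    labelled = []
    for terminator in "ACGT":
        for length in terminated_fragments(template, terminator):
            labelled.append((length, terminator))
    labelled.sort()  # shortest fragment first
    return "".join(base for _, base in labelled)


if __name__ == "__main__":
    print(read_sequence("TACGGT"))
```

Reading the terminator labels in order of fragment length gives the newly synthesized strand, which is the complement of the template.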
Pyrosequencing, which was originally developed by Mostafa Ronaghi, has been commercialized by Biotage (for low-throughput sequencing) and 454 Life Sciences (for high-throughput sequencing). The latter platform sequences roughly 100 megabases in a 7-hour run with a single machine. In the array-based method (commercialized by 454 Life Sciences), single-stranded DNA is annealed to beads and amplified via emPCR. These DNA-bound beads are then placed into wells on a fiber-optic chip along with enzymes which produce light in the presence of ATP. When free nucleotides are washed over this chip, light is produced as ATP is generated when nucleotides join with their complementary base pairs. Addition of one (or more) nucleotide(s) results in a reaction that generates a light signal that is recorded by the CCD camera in the instrument. The signal strength is proportional to the number of nucleotides incorporated in a single nucleotide flow, for example across a homopolymer stretch. [1]
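The relationship between nucleotide flows and signal strength can be illustrated with a small simulation. The sketch below is idealized and makes several assumptions not taken from the text: the flow order "TACG" is arbitrary, the signal is taken to be exactly the number of bases incorporated (no noise), and the function names are hypothetical.

```python
# Watson-Crick pairing used when a flowed nucleotide is incorporated.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flowgram(template, flow_order="TACG", max_flows=1000):
    """Simulate nucleotide flows over a single template strand.

    Returns a list of (flowed_base, signal) pairs, where the signal is
    the number of bases incorporated during that flow. The template is
    assumed to be written in the order the polymerase reads it.
    """
    signals = []
    pos = 0
    flows = 0
    while pos < len(template) and flows < max_flows:
        base = flow_order[flows % len(flow_order)]
        count = 0
        # A homopolymer stretch is extended completely within one flow.
        while pos < len(template) and COMPLEMENT[template[pos]] == base:
            count += 1
            pos += 1
        signals.append((base, count))
        flows += 1
    return signals


def call_bases(signals):
    """Turn a flowgram back into the synthesized sequence."""
    return "".join(base * count for base, count in signals)


if __name__ == "__main__":
    for base, signal in flowgram("AAGTC"):
        print(base, signal)
```

In this idealized model a flow with signal 2 corresponds to a two-base homopolymer, and flows with signal 0 are those in which the flowed nucleotide did not match the next template base.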
RNA is less stable than DNA in the cell, and also more prone to nuclease attack experimentally. As RNA is generated by transcription from DNA, the information is already present in the cell's DNA. However, it is sometimes desirable to sequence RNA molecules. In particular, in eukaryotes RNA molecules are not necessarily co-linear with their DNA template, as introns are excised. To sequence RNA, the usual method is first to reverse transcribe the sample to generate DNA fragments. These can then be sequenced as described above.
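At the sequence level, the reverse transcription step amounts to building the DNA strand complementary to the RNA. A minimal sketch follows, assuming the RNA is given 5' to 3' as a string over A, C, G and U; the function name is hypothetical, and real protocols involve priming and fragmentation that are not modelled here.

```python
# Base pairing between an RNA template and the DNA strand copied from it.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the first-strand cDNA, written 5' to 3'.

    The cDNA is antiparallel to the RNA, so the RNA is read from its
    3' end and each base is replaced by its DNA complement.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


if __name__ == "__main__":
    print(reverse_transcribe("AUGGCC"))
```

The resulting DNA string can then be treated like any other DNA fragment for sequencing purposes.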
Methods for performing protein sequencing include Edman degradation, peptide mass fingerprinting and mass spectrometry.
If the gene encoding the protein can be identified, it is currently much easier to sequence the DNA and infer the protein sequence. Determining part of a protein's amino-acid sequence (often one end) by one of the above methods may be sufficient to enable the identification of a clone carrying the gene.
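Inferring a protein sequence from DNA can be sketched as a codon-by-codon lookup. The example below assumes the standard genetic code, a coding strand supplied 5' to 3' that is already in the correct reading frame, and no introns; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence into one-letter amino acids.

    Translation stops at the first stop codon ('*' in the table) or
    when fewer than three bases remain.
    """
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[start:start + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

Because several codons encode the same amino acid, the reverse inference (from protein back to an exact DNA sequence) is not unique, which is one reason a partial amino-acid sequence is used to identify a clone rather than to reconstruct the gene directly.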
Though polysaccharides are also biopolymers, it is not so common to talk of 'sequencing' a polysaccharide, for several reasons. Although many polysaccharides are linear, many have branches. Many different units (individual monosaccharides) can be used, and bonded in different ways. However, the main theoretical reason is that whereas the other polymers listed here are primarily generated in a 'template-dependent' manner by one processive enzyme, each individual join in a polysaccharide may be formed by a different enzyme. In many cases the assembly is not uniquely specified; depending on which enzyme acts, one of several different units may be incorporated. This can lead to a family of similar molecules being formed. This is particularly true for plant polysaccharides. Methods for the structure determination of oligosaccharides and polysaccharides include NMR spectroscopy and methylation analysis [1].