DNA: composition, structure and properties. DNA molecule structure

DNA Logic is a DNA computing technology that is in its infancy today, but there are high hopes for it in the future. Biological nanocomputers, implanted in living organisms, are still seen by us as something fantastic, unreal. But what is unreal today, tomorrow may turn out to be something commonplace and so natural that it will be difficult to imagine how one could do without it in the past.

DNA computing is a branch of molecular computing on the border between molecular biology and computer science. Its main idea is to build a new computing paradigm and new algorithms based on what is known about the structure and functions of the DNA molecule and on the operations that enzymes perform on DNA in living cells. The prospects for DNA computing include a biological nanocomputer able to store terabytes of information in a volume of a few micrometers. Such a computer could be implanted into a cell of a living organism; its performance would be measured in billions of operations per second, with energy consumption of no more than one billionth of a watt.

Benefits of DNA in Computer Technology

For modern processors and microcircuits, silicon is used as a building material. But the possibilities of silicon are not unlimited, and eventually we will come to the point where further growth in the processing power of processors will be exhausted. Therefore, humanity is already facing an acute problem of finding new technologies and materials that could replace silicon in the future.

DNA molecules may turn out to be the very material that will subsequently replace silicon transistors with their binary logic. Suffice it to say that just one pound (453 g) of DNA molecules has storage capacity that surpasses the total capacity of all modern electronic data storage systems, and the processing power of a droplet-sized DNA processor will exceed the most powerful modern supercomputer.

More than 10 trillion DNA molecules fit in a volume of just 1 cm³. That number of molecules is enough to store 10 TB of information and to perform 10 trillion operations per second.

Another advantage of DNA processors in comparison with conventional silicon processors is that they can perform all calculations not sequentially, but in parallel, which makes it possible to perform the most complex mathematical calculations in literally a matter of minutes. Traditional computers would take months and years to complete such calculations.

DNA molecule structure

As you know, modern computers work with binary logic, which has only two states: logical zero and one. Any information can be encoded in binary code, that is, as a sequence of zeros and ones. DNA molecules contain four bases: adenine (A), guanine (G), cytosine (C) and thymine (T), linked together in a chain. A DNA molecule (single strand) can therefore look like this, for example: ATTTACGGCC. Here the logic is not binary but quaternary. And just as any information can be encoded in binary logic as a sequence of zeros and ones, in DNA molecules any information can be encoded by combining the four bases.
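The quaternary encoding described above can be illustrated with a minimal Python sketch. The assignment of two bits to each base (00 to A, 01 to C, 10 to G, 11 to T) is an assumption chosen for illustration, since the text does not prescribe a particular mapping, and the function names are hypothetical.

```python
# Hypothetical 2-bit-per-base mapping; the article does not fix one.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = bytes_to_dna(b"Hi")
print(strand)                # four bases per byte
print(dna_to_bytes(strand))  # round trip back to the original bytes
```

Each byte becomes four bases, so under this assumed mapping a strand is four times as long in symbols as the data is in bytes.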

The bases in DNA molecules are spaced 0.34 nanometers apart, which gives them an enormous information capacity: the linear density is 18 Mbit/inch. As for surface density, if we assume that one base occupies an area of 1 square nanometer, it comes to more than a million gigabits per square inch. For comparison, the surface recording density of modern hard drives is about 7 Gbit/inch².

Another important property of DNA molecules is that they can take the form of a regular double helix only 2 nm in diameter. Such a helix consists of two chains (sequences of bases), and the content of the first chain strictly corresponds to the content of the second.

This correspondence is achieved due to the presence of hydrogen bonds between the bases of two chains directed towards each other - in pairs G and C or A and T. Describing this property of a double helix, molecular biologists say that DNA chains are complementary due to the formation of G-C and A-T pairs.

For example, if the sequence S is written as ATTACGTCG, then the complementary sequence S' will have the form TAATGCAGC.
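The complementarity rule can be checked with a short Python sketch. The function name is hypothetical, and the complement is taken position by position, as in the example above, without modelling strand direction.

```python
# Watson-Crick pairing: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)


s = "ATTACGTCG"
print(complement(s))  # TAATGCAGC
```

Applying the function twice returns the original sequence, which is why either strand fully determines the other.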

The process of joining two single strands of DNA by linking complementary bases into a regular double helix is called renaturation, and the reverse process, that is, the separation of a double strand into two single strands, is called denaturation (Fig. 1).

Fig. 1. Processes of renaturation and denaturation

The complementary feature of the structure of DNA molecules can be used in DNA calculations. For example, based on complementary sequences, you can implement a powerful error correction mechanism, which is somewhat reminiscent of the RAID Level 1 data mirroring technology.

Basic operations on DNA molecules

Enzymes are used to manipulate DNA molecules in various ways. Just as modern microprocessors have a set of basic operations such as addition, shift, and the logical operations AND, OR, NOT and NOR, DNA molecules can undergo basic operations such as cutting, copying and pasting under the action of enzymes. All of these operations can be performed on DNA molecules in parallel and independently of one another. For example, a DNA strand is extended by the action of enzymes called polymerases on the original molecule. For a polymerase to work, it needs a single-stranded molecule (the template), which determines the added strand according to the principle of complementarity, a primer (a short double-stranded region), and free nucleotides in solution. The process of complementing the DNA strand is shown in Fig. 2.

Fig. 2. Complementing the DNA strand when a polymerase acts on the original molecule

There are polymerases that do not require templates to lengthen the DNA strand. For example, terminal transferase adds single strands of DNA to both ends of a double-stranded molecule. Thus, it is possible to construct an arbitrary DNA strand (Fig. 3).

Fig. 3. The process of lengthening the DNA chain

Enzymes called nucleases are responsible for shortening and cutting DNA molecules. They are divided into endonucleases and exonucleases. Exonucleases shorten single-stranded or double-stranded molecules from one or both ends (Fig. 4), whereas endonucleases cut bonds inside the molecule.

Fig. 4. Shortening of the DNA molecule under the influence of exonuclease

DNA molecules can be cut by site-specific endonucleases, called restriction enzymes, which cut them at a specific place determined by a nucleotide sequence (the recognition site). The cut can be blunt or staggered and can pass through the recognition site or outside it. Endonucleases break internal bonds in the DNA molecule (Fig. 5).

Fig. 5. Cutting the DNA molecule under the influence of restriction enzymes

Ligation - the operation opposite to cutting - is carried out by enzymes called ligases. The sticky ends first join by forming hydrogen bonds. Ligases then seal the nicks, that is, they promote the formation of phosphodiester bonds in the right places, connecting the nucleotides to one another within the same chain (Fig. 6).

Fig. 6. Joining of DNA molecules under the influence of ligases

Another interesting operation on DNA molecules, which can be classified as basic, is modification. It is used to prevent restriction enzymes from finding a specific site and destroying the molecule. There are several types of modifying enzymes - methylase, phosphatase, etc.

A methylase has the same recognition site as the corresponding restriction enzyme. When it finds the target molecule, the methylase modifies the recognition site so that the restriction enzyme can no longer recognize it.

Copying, or amplification, of DNA molecules is carried out by the polymerase chain reaction (PCR), shown in Fig. 7. Each copying cycle can be divided into several stages: denaturation, priming and extension. The process proceeds like an avalanche: the first cycle produces two molecules from one, the second produces four from two, and after n cycles there are 2^n molecules.
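The avalanche-like growth can be made concrete with a small Python sketch. It assumes an idealized reaction in which every molecule is copied in every cycle; the function name is hypothetical.

```python
def pcr_copies(initial_molecules: int, cycles: int) -> int:
    """Idealized PCR yield: the number of molecules doubles each cycle."""
    return initial_molecules * 2 ** cycles


for n in (1, 2, 10, 30):
    print(n, pcr_copies(1, n))
```

Under this assumption, thirty cycles starting from a single molecule already give more than a billion copies.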

Fig. 7. The process of copying a DNA molecule

Another operation that can be performed on DNA molecules is sequencing, that is, determining the sequence of nucleotides in DNA. Various methods are used for sequencing strands of different lengths. Using the primer-mediated walk method, it is possible to sequence a sequence of 250-350 nucleotides in one step. After the discovery of restriction enzymes, it became possible to sequence long sequences piece by piece.

Well, the last procedure that we will mention is gel electrophoresis, which is used to separate DNA molecules by length. If the molecules are placed in a gel and a constant electric field is applied, they will move towards the anode, with shorter molecules moving faster. Using this phenomenon, it is possible to realize sorting of DNA molecules by length.

DNA computing

DNA molecules with their unique form of structure and the ability to implement parallel computations allow a different look at the problem of computer computation. Traditional processors execute programs sequentially. Despite the existence of multiprocessor systems, multi-core processors and various technologies aimed at increasing the level of parallelism, basically all computers built on the basis of the von Neumann architecture are devices with a sequential instruction execution mode. All modern processors implement the following command and data processing algorithm: fetching instructions and data from memory and executing instructions on the selected data. This cycle repeats itself many times and at great speed.

DNA computing is based on a completely different, parallel architecture, and in some cases it is precisely for this reason that it can easily handle tasks that would take computers based on the von Neumann architecture years to solve.

Adleman's experiment

The history of DNA computing begins in 1994, when Leonard M. Adleman set out to solve a fairly simple mathematical problem in a decidedly non-trivial way - using DNA computation. In fact, this was the first demonstration of a prototype biological computer based on DNA computation.

The problem that Adleman chose to solve with DNA computation is known as finding a Hamiltonian path in a graph, a relative of the traveling salesman problem. Its meaning is as follows: there are several cities that need to be visited, and each city may be visited only once.

Knowing the point of departure and destination, it is necessary to determine the travel route (if it exists). In this case, the route is drawn up taking into account possible flights and connections of various flights.

So, suppose there are only four cities (seven cities were used in Adleman's experiment): Atlanta, Boston, Detroit, and Chicago. The traveler's task is to choose a route from Atlanta to Detroit that visits each city only once. The possible connections between the cities are shown in Fig. 8.

Fig. 8. Possible connections between cities

It is easy to see (it only takes a few seconds) that the only possible route (the Hamiltonian path) is Atlanta - Boston - Chicago - Detroit.

Indeed, with a small number of cities, it is quite easy to draw up such a route. But with an increase in their number, the complexity of solving the problem increases exponentially and becomes difficult not only for a person, but also for a computer.

Fig. 9 shows a graph of seven vertices with the possible transitions between them indicated. It takes an ordinary person no more than a minute to find the Hamiltonian path. This is the graph that was used in Adleman's experiment. Fig. 10 shows a graph of 12 vertices; in this case, finding a Hamiltonian path is no longer such an easy task. In general, the complexity of finding a Hamiltonian path grows exponentially with the number of vertices in the graph. For example, for a graph of 10 vertices there are 10^6 possible paths, for a graph of 20 vertices 10^12, and for a graph of 100 vertices 10^100. Clearly, in the last case even a modern supercomputer would need an enormous amount of time to generate and test all possible paths.

Fig. 9. Finding the optimal travel route

Fig. 10. A graph consisting of 12 vertices

So, let's return to our example of finding the Hamiltonian path in the case of four cities (see Fig. 8).

To solve this problem using DNA computation, Adleman encoded the name of each city as a single strand of DNA containing 20 bases. For simplicity, we will encode each city with an eight-base DNA strand. The DNA codes of the cities are shown in Table 1. Note that an eight-base strand is more than is needed to encode only four cities.

Table 1. DNA codes of cities

Note that for each city's DNA code, which defines a single DNA strand, there is also a complementary strand, that is, a complementary DNA city code. The city's DNA code and its complementary code are fully equivalent representations of the city.

Then all possible flights (Atlanta - Boston, Boston - Detroit, Chicago - Detroit, etc.) must be encoded as single DNA strands. The following approach was used: the last four bases are taken from the code of the city of departure, and the first four from the code of the city of arrival.

For example, the flight Atlanta - Boston will correspond to the following sequence: GCAG TCGG (Fig. 11).
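The flight-encoding rule can be sketched in Python as follows. Table 1 is not reproduced here, so the three city codes below are pieced together from the sequences quoted elsewhere in this article; Detroit is left out because its full code does not appear in the text. The function name is hypothetical.

```python
# City codes reconstructed from sequences quoted in the article (an assumption,
# since Table 1 itself is not reproduced). Detroit's full code is not given.
CITY_CODES = {
    "Atlanta": "ACTTGCAG",
    "Boston": "TCGGACTG",
    "Chicago": "GGCTATGT",
}


def flight_code(departure: str, arrival: str) -> str:
    """Last four bases of the departure city + first four of the arrival city."""
    return CITY_CODES[departure][-4:] + CITY_CODES[arrival][:4]


print(flight_code("Atlanta", "Boston"))  # GCAGTCGG
print(flight_code("Boston", "Chicago"))  # ACTGGGCT
```

Because each flight strand overlaps half of each of its two cities, a flight can only be followed by another flight leaving the city where the first one arrived.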

Fig. 11. Coding of flights between cities

The DNA codes of all possible flights are shown in Table 2.

Table 2. DNA codes of all possible flights

So, after the codes of the cities and possible flights between them are ready, we can proceed directly to the calculation of the Hamiltonian path. The calculation process consists of four stages:

  1. Generate all possible routes.
  2. Select routes that start in Atlanta and end in Detroit.
  3. Select routes, the length of which corresponds to the number of cities (in our case, the length of the route is four cities).
  4. Select routes in which each city is present only once.
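The four stages above can be imitated in software with a minimal Python sketch that works on city names rather than on DNA. Several things here are assumptions: the list of flights (Fig. 8 is not reproduced, so the set below is inferred from the routes mentioned in the text), the cap on route length (in a test tube the routes are not bounded this way), and all function names. Unlike the test tube, this code builds the routes one after another rather than in parallel.

```python
CITIES = ["Atlanta", "Boston", "Chicago", "Detroit"]

# Assumed one-way flights; the figure with the actual connections is missing.
FLIGHTS = [
    ("Atlanta", "Boston"), ("Atlanta", "Detroit"),
    ("Boston", "Atlanta"), ("Boston", "Chicago"), ("Boston", "Detroit"),
    ("Chicago", "Detroit"),
]


def generate_routes(max_cities: int) -> list:
    """Stage 1: every route that follows existing flights, up to max_cities stops."""
    frontier = [[city] for city in CITIES]
    routes = list(frontier)
    for _ in range(max_cities - 1):
        frontier = [route + [arrival]
                    for route in frontier
                    for departure, arrival in FLIGHTS
                    if departure == route[-1]]
        routes.extend(frontier)
    return routes


def hamiltonian_paths(start: str, end: str, max_cities: int = 6) -> list:
    routes = generate_routes(max_cities)
    # Stage 2: keep routes with the right first and last city.
    routes = [r for r in routes if r[0] == start and r[-1] == end]
    # Stage 3: keep routes whose length equals the number of cities.
    routes = [r for r in routes if len(r) == len(CITIES)]
    # Stage 4: keep routes in which every city appears (hence exactly once).
    routes = [r for r in routes if set(r) == set(CITIES)]
    return routes


print(hamiltonian_paths("Atlanta", "Detroit"))
```

With the assumed flights, stage 3 still leaves the route Atlanta - Boston - Atlanta - Detroit in play, and only stage 4 removes it.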

So, in the first step, we have to generate all possible routes. Recall that the correct route corresponds to flights Atlanta - Boston - Chicago - Detroit. This route corresponds to the DNA molecule GCAG TCGG ACTG GGCT ATGT CCGA.

To generate all possible routes, it is enough to put all the prepared ingredients in a test tube: the DNA molecules corresponding to all possible flights and the DNA molecules corresponding to all cities. However, instead of the single DNA strands that encode the cities, we must use their complementary strands. That is, instead of the ACTT GCAG strand corresponding to Atlanta, we use the complementary strand TGAA CGTC, and so on, since a city's DNA code and its complementary code are equivalent.

Then we take all these molecules (a pinch is enough; it will contain about 10^14 different molecules), put them into water, add ligases, pronounce a spell and ... in literally a few seconds we get all possible routes.

The process of formation of DNA chains corresponding to different routes is as follows. Consider, for example, the GCAG TCGG chain, which is responsible for the Atlanta-Boston flight. Due to the high concentration of various molecules, this strand will necessarily meet with the complementary AGCC TGAC DNA strand corresponding to Boston. Since the TCGG and AGCC groups are complementary to each other, due to the formation of hydrogen bonds, these chains are linked to each other (Fig. 12).

Fig. 12. Coupling of the strands corresponding to the Atlanta - Boston flight and to Boston

Now the chain that has formed will inevitably meet the ACTG GGCT DNA chain corresponding to the Boston-Chicago flight, and since the ACTG group (the first four bases in this chain) is complementary to the TGAC group (the last four bases in the Boston complementary code), the ACTG GGCT chain will join the chain already formed. In the same way, this chain will then be joined by the DNA chain corresponding to the city of Chicago (complementary code), and then by the Chicago-Detroit flight chain. The process of forming a route is shown in Fig. 13.
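The way a complementary city strand holds two flight strands together can be checked with a short Python sketch. Pairing is tested position by position, following the notation used in this article, without modelling strand direction; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def city_joins_flights(flight_in: str, city_complement: str, flight_out: str) -> bool:
    """True if the first half of the complementary city strand pairs with the
    end of the incoming flight and its second half pairs with the start of
    the outgoing flight."""
    half = len(city_complement) // 2
    return (complement(flight_in[-half:]) == city_complement[:half]
            and complement(flight_out[:half]) == city_complement[half:])


# Atlanta-Boston flight, complementary Boston code, Boston-Chicago flight.
print(city_joins_flights("GCAGTCGG", "AGCCTGAC", "ACTGGGCT"))  # True
# The Chicago-Detroit flight does not depart from Boston, so it cannot attach.
print(city_joins_flights("GCAGTCGG", "AGCCTGAC", "ATGTCCGA"))  # False
```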

Fig. 13. Formation of the DNA strand corresponding to the route Atlanta - Boston - Chicago - Detroit

We have considered an example of the formation of only one route (and this is precisely the Hamiltonian route). All other possible routes are obtained in a similar way (for example, Atlanta - Boston - Atlanta - Detroit). It is important that all routes are formed simultaneously, that is, in parallel. Moreover, the time required to create all possible routes in a given problem and all routes in a problem with 10 or 20 cities is absolutely the same (if only the initial DNA molecules were enough). Actually, it is in the parallel algorithm of DNA computations that the main advantage lies in comparison with the von Neumann architecture.

So, DNA molecules are formed in a test tube that correspond to all possible routes. However, this is not yet a solution to the problem - we need to isolate the only DNA molecule that is responsible for the Hamiltonian route. Therefore, the next step is to select the molecules corresponding to the routes starting in Atlanta and ending in Detroit.

For this, polymerase chain reaction (PCR) is used, as a result of which many copies are created only of those DNA strands that start with the Atlanta code and end with the Detroit code.

To carry out the polymerase chain reaction, two primers are used: GCAG and GGCT. The process of copying DNA molecules that start with the Atlanta DNA code and end with the Detroit DNA code is shown in Fig. 14.

Fig. 14. The process of copying DNA molecules during PCR

Note that in the presence of the GCAG and GGCT primers, DNA molecules that begin with the Atlanta code but do not end with the Detroit code will also be copied (under the action of the GCAG primer), as will molecules that end with the Detroit code but do not start with the Atlanta code (under the action of the GGCT primer). However, such molecules are copied much more slowly than molecules that both start with the Atlanta code and end with the Detroit code. Therefore, after PCR, the mixture consists predominantly of double-helix DNA molecules corresponding to routes that start in Atlanta and end in Detroit.

At the next stage, it is necessary to select the molecules of the required length, that is, those that contain the DNA codes of exactly four cities. For this, gel electrophoresis is used, which allows you to sort the molecules by length. As a result, we get molecules of the required length (exactly four cities), starting with the Atlanta code and ending with the Detroit code.

Now you need to make sure that the selected molecules contain the code of each city only once. This operation is accomplished using a process known as affinity purification.

For this operation, microscopic magnetic beads about one micron in diameter are used. The beads are coated with DNA strands complementary to the code of a particular city, which act as probes. For example, to check whether the Boston code is present in the DNA strands under study, a magnetic bead is first placed in a test tube containing DNA molecules complementary to the Boston code, so that the bead becomes covered with the probes we need. The bead is then placed in the test tube with the DNA strands under investigation, and the strands that contain the Boston code stick to it (through hydrogen bonds between complementary groups). Next, the bead with the captured molecules is taken out and placed in a new solution, where the temperature is raised so that the DNA molecules fall off the bead, and the bead is removed. This sorting procedure is repeated in turn for each city, leaving only those molecules that contain the DNA codes of all the cities and therefore correspond to the Hamiltonian path. At that point the problem is essentially solved: all that remains is to read out the answer.

Conclusion

Adleman demonstrated the solution of the Hamiltonian path problem for only seven cities and spent seven days on it. This was the first experiment to demonstrate the power of DNA computing. In effect, Adleman showed that DNA computation can be used to solve enumeration problems efficiently, and he outlined a technique that later served as the basis for the parallel filtering model.

However, many researchers are not optimistic about the future of biological computers. Here is just one small example. If this method were used to find a Hamiltonian path in a graph of 200 vertices, the required quantity of DNA molecules would be comparable in weight to our entire planet! This fundamental limitation is, of course, something of a dead end. Therefore, many research laboratories (for example, IBM) have chosen to focus on other ideas for alternative computers, such as carbon nanotubes and quantum computers.

Since Adleman's experiment, there have been many other studies of the possibilities of DNA computing. One example is the experiment of E. Shapiro, which implemented a finite state machine with two states, S0 and S1, that answers the question of whether the input sequence contains an even or an odd number of characters.
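The logic of such a two-state machine can be sketched in Python. This is only a software analogue based on the description above, not the molecular implementation; the assumption that every input character toggles the state, and the function name, are illustrative.

```python
def parity_state(symbols: str) -> str:
    """Two-state automaton: start in S0 and switch state on every character.

    Ending in S0 means an even number of characters, S1 an odd number.
    """
    state = "S0"
    for _ in symbols:
        state = "S1" if state == "S0" else "S0"
    return state


for text in ("", "ab", "aba"):
    print(repr(text), parity_state(text))
```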

Today, DNA computing is no more than a promising technology at the stage of laboratory research, and it will remain at that stage for years to come. In fact, at the present stage of development, the following global question has to be answered: what class of problems can be solved with the help of DNA, and is it possible to build a general model of DNA computation that is suitable for both implementation and use?

A spatial model of the DNA molecule was proposed in 1953 by the American geneticist James Watson (born 1928) and the British physicist Francis Crick (born 1916). For their outstanding contribution to this discovery, they were awarded the 1962 Nobel Prize in Physiology or Medicine.

Deoxyribonucleic acid (DNA) is a biopolymer whose monomer is the nucleotide. Each nucleotide contains a phosphoric acid residue linked to the sugar deoxyribose, which, in turn, is linked to a nitrogenous base. There are four types of nitrogenous bases in the DNA molecule: adenine, thymine, guanine and cytosine.

A DNA molecule consists of two long chains, intertwined in the form of a spiral, most often, right-handed. The exception is viruses that contain single-stranded DNA.

Phosphoric acid and sugar, which are part of the nucleotides, form the vertical backbone of the helix. The nitrogenous bases are located perpendicular to it and form "bridges" between the two chains. The nitrogenous bases of one chain are linked to the nitrogenous bases of the other chain according to the principle of complementarity, or correspondence.

The principle of complementarity. In the DNA molecule, adenine combines only with thymine, guanine - only with cytosine.

The nitrogenous bases are optimally matched to each other. Adenine and thymine are linked by two hydrogen bonds, guanine and cytosine by three. Therefore, more energy is required to break the guanine-cytosine bond. Thymine and cytosine are similar in size and much smaller than adenine and guanine. A thymine-cytosine pair would be too small, an adenine-guanine pair would be too large, and the DNA helix would bend.
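The difference between the two kinds of pairs can be illustrated by counting hydrogen bonds in a duplex with a minimal Python sketch. The function name is hypothetical, and the count is only a rough indicator of how tightly two strands are held together.

```python
# Hydrogen bonds formed by each base with its partner: A-T pairs have 2, G-C pairs 3.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def duplex_hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds between a strand and its complementary strand."""
    return sum(HYDROGEN_BONDS[base] for base in strand)


print(duplex_hydrogen_bonds("AATT"))  # 8
print(duplex_hydrogen_bonds("GGCC"))  # 12
```

Two duplexes of the same length can therefore differ noticeably in how much energy is needed to separate them, depending on their G-C content.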

Hydrogen bonds are fragile. They break easily and re-form just as easily. The chains of the double helix can be pulled apart like a zipper by the action of enzymes or at high temperatures.

RNA molecule. Ribonucleic acid (RNA)

A molecule of ribonucleic acid (RNA) is also a biopolymer, which consists of four types of monomers - nucleotides. Each monomer of an RNA molecule contains a phosphoric acid residue, a ribose sugar and a nitrogenous base. Moreover, the three nitrogenous bases are the same as in DNA - adenine, guanine and cytosine, but instead of thymine, the RNA contains a structurally similar uracil. RNA is a single-stranded molecule.

The quantitative content of DNA molecules in cells of any kind is practically constant, but the amount of RNA can vary significantly.

Types of RNA

There are three types of RNA, depending on the structure and function performed.

1. Transfer RNA (tRNA). Transfer RNAs are mainly found in the cytoplasm of the cell. They carry amino acids to the site of protein synthesis in the ribosome.

2. Ribosomal RNA (rRNA). Ribosomal RNA binds to certain proteins and forms ribosomes - organelles in which proteins are synthesized.

3. Messenger RNA (mRNA), also called template RNA. Messenger RNA carries information about the structure of the protein from the DNA to the ribosome. Each mRNA molecule corresponds to a specific section of DNA that encodes the structure of one protein molecule. Therefore, each of the thousands of proteins that are synthesized in the cell has its own specific mRNA.

DNA is a polymer whose monomer units are nucleotides.

What is DNA?

All information about the structure and functioning of any living organism is contained in a coded form in its genetic material. The basis of the body's genetic material is deoxyribonucleic acid (DNA).

In most organisms, DNA is a long, double-stranded polymer molecule. The sequence of monomer units (deoxyribonucleotides) in one of its chains corresponds to (is complementary to) the sequence of deoxyribonucleotides in the other. The principle of complementarity ensures that new DNA molecules identical to the original ones are synthesized when DNA is duplicated (replication).

A section of a DNA molecule that encodes a certain trait is a gene.

Genes are individual genetic elements that have a strictly specific nucleotide sequence and code for certain characteristics of the organism. Some of them encode proteins, others only RNA molecules.

The information contained in the genes encoding proteins (structural genes) is decoded in two sequential processes:

  • RNA synthesis (transcription): messenger RNA (mRNA) is synthesized on a certain region of DNA, which serves as a template.
  • protein synthesis (translation): protein is synthesized through the coordinated work of a multicomponent system involving transfer RNA (tRNA), mRNA, enzymes and various protein factors.
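The first of these two processes, transcription, can be sketched in Python as a base-by-base mapping from the DNA template to mRNA, with uracil taking the place of thymine. The sketch assumes that the template strand is written in the order in which it is read, so that the output is the mRNA in its order of synthesis; the example sequence and the function name are hypothetical.

```python
# Pairing used during transcription: the DNA template base on the left
# determines the RNA base on the right (uracil replaces thymine in RNA).
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Build the mRNA complementary to a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template)


print(transcribe("TACGGT"))  # AUGCCA
```

Translation is not modelled here, since it would require the full table that assigns an amino acid to each nucleotide triplet.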

All these processes ensure the correct translation of the genetic information encoded in DNA from the language of nucleotides into the language of amino acids. The amino acid sequence of a protein molecule determines its structure and functions.

DNA structure

DNA is a linear organic polymer. Its monomer units are nucleotides, which, in turn, consist of an organic (nitrogenous) base, a monosaccharide residue and a phosphate group.

The phosphate group is attached to the 5'-carbon atom of the monosaccharide residue, and the organic base to the 1'-atom.

There are two types of bases in DNA: purines (adenine and guanine) and pyrimidines (cytosine and thymine).


The structure of nucleotides in the DNA molecule

In DNA, the monosaccharide is 2'-deoxyribose, which contains only one hydroxyl group (OH), whereas in RNA it is ribose, which has two hydroxyl groups (OH).

Nucleotides are linked to each other by phosphodiester bonds: the phosphate group on the 5'-carbon atom of one nucleotide is linked to the 3'-OH group of the deoxyribose of the adjacent nucleotide (Figure 1). At one end of the polynucleotide chain there is a 3'-OH group (the 3'-end), and at the other a 5'-phosphate group (the 5'-end).

DNA structure levels

It is customary to distinguish 3 levels of DNA structure:

  • primary;
  • secondary;
  • tertiary.

The primary structure of DNA is the sequence of nucleotides in the polynucleotide chain of DNA.

The secondary structure of DNA is stabilized by hydrogen bonds between complementary base pairs and is a double helix of two antiparallel chains twisted to the right around a common axis.

One full turn of the helix is 3.4 nm long, and the distance between the chains is 2 nm.

The tertiary structure of DNA is DNA supercoiling. In some regions the double helix can coil further to form a supercoil or an open circular form, which is often produced by covalent joining of its free ends. Supercoiling allows a very long DNA molecule to be packed compactly in the chromosome: in extended form a DNA molecule is 8 cm long, whereas as a supercoil it fits into 5 nm.

Chargaff's rule

Chargaff's rules (E. Chargaff) describe regularities in the quantitative content of nitrogenous bases in the DNA molecule:

  1. In DNA, the molar fractions of purine and pyrimidine bases are equal: A + G = C + T, or (A + G) / (C + T) = 1.
  2. In DNA, the number of bases with amino groups (A + C) equals the number of bases with keto groups (G + T): A + C = G + T, or (A + C) / (G + T) = 1.
  3. The equivalence rule: A = T, G = C; A / T = 1; G / C = 1.
  4. The nucleotide composition of DNA is specific to each group of organisms and is characterized by the specificity coefficient (G + C) / (A + T). In higher plants and animals the specificity coefficient is less than 1 and varies only slightly, from 0.54 to 0.98; in microorganisms it is greater than 1.
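These rules follow directly from complementary pairing, which a short Python sketch can confirm for any double-stranded molecule. The example strand is arbitrary and the function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_base_counts(strand: str) -> Counter:
    """Count bases in both chains of the duplex formed by a strand and its complement."""
    partner = "".join(COMPLEMENT[base] for base in strand)
    return Counter(strand + partner)


def specificity_coefficient(counts: Counter) -> float:
    """(G + C) / (A + T) for the given base counts."""
    return (counts["G"] + counts["C"]) / (counts["A"] + counts["T"])


counts = duplex_base_counts("TAGGCAT")
print(dict(counts))
print(counts["A"] == counts["T"], counts["G"] == counts["C"])  # rule 3
print(counts["A"] + counts["G"] == counts["C"] + counts["T"])  # rule 1
print(counts["A"] + counts["C"] == counts["G"] + counts["T"])  # rule 2
print(specificity_coefficient(counts))                         # rule 4
```

Rules 1 to 3 hold for every duplex, whereas the specificity coefficient depends on the particular sequence.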

Watson-Crick DNA Model

In 1953, James Watson and Francis Crick, relying on X-ray structural analysis of DNA crystals, concluded that native DNA consists of two polymer chains forming a double helix (Figure 3).

The coiled polynucleotide chains are held together by hydrogen bonds formed between complementary bases of the opposite chains (Figure 3). Adenine pairs only with thymine, and guanine only with cytosine. An A-T base pair is stabilized by two hydrogen bonds, and a G-C pair by three.

The length of double-stranded DNA is usually measured in pairs of complementary nucleotides, or base pairs (bp). For DNA molecules consisting of thousands or millions of base pairs, the units kbp and Mbp are used, respectively. For example, the DNA of human chromosome 1 is a single double helix 263 Mbp long.

The sugar-phosphate backbone of the molecule, which consists of phosphate groups and deoxyribose residues joined by 5'-3' phosphodiester linkages, forms the "side rails of a spiral staircase", and the A-T and G-C base pairs form its steps (Figure 3).

Figure 3: Watson-Crick DNA Model

The chains of a DNA molecule are antiparallel: one runs in the 3'→5' direction, the other in the 5'→3' direction. In accordance with the principle of complementarity, if one strand contains the nucleotide sequence 5'-TAGGCAT-3', the complementary strand must contain the sequence 3'-ATCCGTA-5' at that position. The double-stranded form then looks like this:

  • 5'-TAGGCAT-3'
  • 3'-ATCCGTA-5'

In this notation, the 5'-end of the upper chain is always placed on the left and its 3'-end on the right.
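Because the chains are antiparallel, the lower strand reads differently depending on the direction in which it is written. The following Python sketch shows both readings for the example above; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def paired_strand_3_to_5(strand_5_to_3: str) -> str:
    """Complementary strand written 3'→5', aligned under the input strand."""
    return "".join(COMPLEMENT[base] for base in strand_5_to_3)


def reverse_complement(strand_5_to_3: str) -> str:
    """The same complementary strand rewritten in the 5'→3' direction."""
    return paired_strand_3_to_5(strand_5_to_3)[::-1]


upper = "TAGGCAT"
print("5'-" + upper + "-3'")
print("3'-" + paired_strand_3_to_5(upper) + "-5'")
print("5'-" + reverse_complement(upper) + "-3'")
```

The last line is the form in which the lower strand would be written if it were listed on its own, with its 5'-end on the left.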

The carrier of genetic information must meet two basic requirements: reproduce (replicate) with high fidelity and determine (encode) the synthesis of protein molecules.

The Watson-Crick DNA model fully meets these requirements, since:

  • according to the principle of complementarity, each DNA strand can serve as a template for the formation of a new complementary strand. Therefore, after one round of replication, two daughter molecules are formed, each with the same nucleotide sequence as the original DNA molecule.
  • the nucleotide sequence of a structural gene uniquely determines the amino acid sequence of the protein it encodes.
  1. One human DNA molecule contains about 1.5 gigabytes of information. The DNA of all the cells of the human body amounts to 60 billion terabytes, stored in 150-160 grams of DNA.
  2. International DNA Day is celebrated on April 25. On this day in 1953, James Watson and Francis Crick published in the journal Nature their article titled "Molecular structure of nucleic acids", which described the double helix of the DNA molecule.

Bibliography: Molecular Biotechnology: Principles and Applications, B. Glick, J. Pasternak, 2002

Many people have always wondered why some of the traits that parents have are passed on to the child (for example, eye color, hair, face shape, and others). Science has shown that this transmission of a trait depends on the genetic material, or DNA.

What is DNA?

Nucleotide

As mentioned, the basic structural unit of deoxyribonucleic acid is the nucleotide. It is a complex structure. The composition of the DNA nucleotide is as follows.

At the center of the nucleotide is a five-carbon sugar (deoxyribose in DNA, in contrast to RNA, which contains ribose). A nitrogenous base is attached to it; five types of base are distinguished: adenine, guanine, thymine, uracil and cytosine. In addition, each nucleotide contains a phosphoric acid residue.

DNA contains only nucleotides built from these structural units; of the five bases, uracil occurs in RNA rather than in DNA.

All nucleotides are arranged in a chain and follow one another. Grouped into triplets (three nucleotides each), they form a sequence in which each triplet corresponds to a specific amino acid. As a result, a chain is formed.

The two chains are joined to each other by bonds between their nitrogenous bases. The main bond between the nucleotides of opposite chains is the hydrogen bond.

Nucleotide sequences are the basis of genes. A violation of their structure leads to a failure in protein synthesis and to the manifestation of mutations. Human DNA contains a set of genes that is shared by almost all people and distinguishes them from other organisms.

Nucleotide modification

In some cases, a nitrogenous base is modified to make the transmission of a particular trait more stable. The chemical composition of DNA changes through the addition of a methyl group (CH3). Such a modification (of a single nucleotide) stabilizes gene expression and the transfer of traits to daughter cells.

This “improvement” of the molecular structure does not in any way affect the pairing of nitrogenous bases.

This modification is also used for the inactivation of the X chromosome. As a result, Barr bodies are formed.

With enhanced carcinogenesis, DNA analysis shows that the nucleotide chain has been methylated on multiple bases. In the conducted observations, it was noticed that the source of the mutation is usually methylated cytosine. Usually, in a tumor process, demethylation can help stop the process, but due to its complexity, this reaction is not carried out.

DNA structure

Two levels of structure are distinguished in the molecule. The first (primary structure) is the linear sequence formed by nucleotides. Its construction is subject to certain rules. The sequence of nucleotides in a DNA chain is written starting at the 5'-end and ending at the 3'-end. The second, opposite chain is constructed in the same way, except that the chains are oriented in opposite directions, so the 5'-end of one chain lies opposite the 3'-end of the other.

The secondary structure of DNA is a helix. It is maintained by hydrogen bonds between nucleotides located opposite each other. Hydrogen bonds form only between complementary nitrogenous bases (only thymine can lie opposite adenine of the first chain, and only cytosine opposite guanine). This accuracy is due to the fact that the second chain is built using the first as a template, so there is an exact correspondence between the nitrogenous bases.

Molecule synthesis

How is the DNA molecule formed?

In the cycle of its formation, three stages are distinguished:

  • Disconnection of chains.
  • Attachment of synthesizing units to one of the chains.
  • Completion of the second chain according to the principle of complementarity.

At the stage of chain separation, the main role is played by enzymes: helicases, which break the hydrogen bonds between the chains, and DNA gyrase, which relieves the torsional stress that unwinding creates.

After the chains separate, the main synthesizing enzyme, DNA polymerase, comes into play. It builds the new chain from its 5'-end toward its 3'-end, attaching the nucleotides whose nitrogenous bases are complementary to those of the template. Having reached a termination site, the polymerase detaches from the original chain.

After the daughter chain has formed, hydrogen bonds form between the bases of the two chains and hold the newly formed DNA molecule together.

Where can you find this molecule?

If you delve into the structure of cells and tissues, you can see that DNA is mainly contained in the nucleus, which is responsible for the formation of new daughter cells or their clones. The DNA in it is divided between the newly formed cells either evenly (clones are formed) or in parts (as can be observed during meiosis). Damage to the nucleus disrupts the formation of new tissues, which leads to mutation.

In addition, mitochondria contain a special type of hereditary material. In them, the DNA is somewhat different from that in the nucleus (mitochondrial deoxyribonucleic acid is ring-shaped and performs slightly different functions).

The molecule itself can be released from any cells of the body (for research, a smear from the inside of the cheek or blood is most often used). There is no genetic material only in the exfoliating epithelium and some blood cells (erythrocytes).

Functions

The composition of the DNA molecule determines its performance of the function of transmitting information from generation to generation. This occurs due to the synthesis of certain proteins that determine the manifestation of one or another genotypic (internal) or phenotypic (external - for example, eye or hair color) trait.

Information is transferred by implementing the genetic code. Based on the information encoded in DNA, specific messenger, ribosomal and transfer RNAs are produced. Each of them has a specific role: messenger RNA carries the sequence used to synthesize proteins, ribosomal RNA is involved in the assembly of protein molecules, and transfer RNA delivers the corresponding amino acids.

Any failure in their work or a change in structure leads to a disruption in the function performed and the appearance of atypical signs (mutations).

A DNA paternity test allows you to determine the presence of related traits between people.

Genetic tests

What can genetic material research be used for at the present time?

DNA analysis is used to determine many factors or changes in the body.

First of all, the study allows you to determine the presence of congenital, inherited diseases. Such diseases include Down syndrome, autism, Marfan syndrome.

DNA can also be examined to determine family ties. The paternity test has long been widely used in many, primarily legal, processes. This study is prescribed when determining the genetic relationship between illegitimate children. Often this test is taken by applicants for inheritance when questions arise from the authorities.

The chemical composition of DNA and its macromolecular organization. Types of DNA helices. Molecular mechanisms of DNA recombination, replication and repair. The concept of nucleases and polymerases. DNA replication as a condition for the transmission of genetic information to descendants. General characteristics of the replication process. Actions that take place in a replication fork. Telomere replication, telomerase. Significance of underreplication of terminal chromosome fragments in the aging mechanism. Replication error correction systems. Corrective properties of DNA polymerases. Damaged DNA repair mechanisms. The concept of DNA repair diseases. Molecular mechanisms of general genetic recombination. Site-specific recombination. Gene conversion.

In 1865, Gregor Mendel discovered genes, and in 1869 his contemporary Friedrich Miescher discovered nucleic acids (in the nuclei of pus cells and salmon sperm). However, for a long time these discoveries were not connected with each other, and the structure and nature of the substance of heredity remained unknown. The genetic role of nucleic acids (NA) was established after the discovery and explanation of the phenomena of transformation (1928, F. Griffith; 1944, O. Avery), transduction (1951, Lederberg, Zinder) and reproduction of bacteriophages (1952, A. Hershey, M. Chase).

Transformation, transduction and the reproduction of bacteriophages convincingly proved the genetic role of DNA. In RNA-containing viruses (HIV, influenza, TMV, murine leukemia virus, etc.), this role is played by RNA.

Nucleic acid structure. Nucleic acids (NA) are biopolymers involved in the storage and transmission of genetic information. The monomers of NA are nucleotides consisting of a nitrogenous base, a monosaccharide, and one or more phosphate groups. All nucleotides within NA are monophosphates. A nucleotide without a phosphate group is called a nucleoside. The sugar in NA is the D-isomer and β-anomer of ribose or 2-deoxyribose. Nucleotides containing ribose are called ribonucleotides and are the monomers of RNA; nucleotides derived from deoxyribose are deoxyribonucleotides, and DNA is made up of them. There are two types of nitrogenous bases: purines (adenine, guanine) and pyrimidines (cytosine, thymine, uracil). Both RNA and DNA contain adenine, guanine and cytosine; uracil is found only in RNA, and thymine only in DNA.

In some cases, nucleic acids contain rare minor nucleotides, such as dihydrouridine, 4-thiouridine, inosine, etc. Their diversity is especially great in tRNA. Minor nucleotides are formed by chemical modification of the bases after the polymer chain has formed. Various methylated derivatives are extremely common in RNA and DNA: 5-methyluridine, 5-methylcytidine, 1-N-methyladenosine, 2-N-methylguanosine. In RNA, the 2'-hydroxyl groups of ribose residues can also be methylated, which leads to the formation of 2'-O-methylcytidine or 2'-O-methylguanosine.

The ribonucleotide and deoxyribonucleotide units are interconnected by phosphodiester bridges linking the 5'-hydroxyl group of one nucleotide to the 3'-hydroxyl group of the next. Thus, a regular backbone is formed by phosphate and sugar (ribose or deoxyribose) residues, and the bases are attached to the sugars much as side groups are attached in proteins. The order of the bases along the chain is called the primary structure of the nucleic acid. The sequence of bases is usually read in the direction from the 5'- to the 3'-carbon atom of the pentose.

DNA structure. The model of the structure of DNA in the form of a double helix was proposed by Watson and Crick in 1953 (Fig. 7).

According to this three-dimensional model, a DNA molecule consists of two oppositely directed polynucleotide chains that form a right-handed helix about a common axis. The nitrogenous bases are located inside the double helix with their planes perpendicular to the main axis, and the sugar-phosphate residues are exposed on the outside. Specific H-bonds are formed between the bases: adenine with thymine (or uracil) and guanine with cytosine; this is called Watson-Crick pairing. As a result, the bulkier purines always interact with the smaller pyrimidines, which provides an optimal backbone geometry. The antiparallel strands of the double helix are identical neither in base sequence nor in nucleotide composition, but they are complementary to each other precisely because of the specific hydrogen bonding between these bases.

Fig. 7. B-form of DNA

Complementarity is very important for DNA replication. The relationships between the numbers of different bases in DNA, identified by Chargaff et al. in the 1950s, were of great importance for establishing the structure of DNA: it was shown that the number of adenine residues in DNA, regardless of the organism, is equal to the number of thymine residues, and the number of guanine residues is equal to the number of cytosine residues. These equalities are a consequence of selective base pairing (Fig. 8).
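That these equalities follow directly from base pairing can be checked with a short Python sketch. It is an illustration with hypothetical function names: it builds the partner strand by complementarity and counts the bases over both strands of the duplex.

```python
from collections import Counter

# Watson-Crick pairing: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_base_counts(strand: str) -> Counter:
    """Count each base over both strands of the duplex formed by `strand`
    and its complementary strand."""
    partner = "".join(COMPLEMENT[base] for base in strand)
    return Counter(strand + partner)


if __name__ == "__main__":
    counts = duplex_base_counts("TAGGCAT")
    # Every A on one strand faces a T on the other, and every G faces a C,
    # so the totals over the duplex must be equal pairwise.
    print(counts["A"], counts["T"])  # 4 4
    print(counts["G"], counts["C"])  # 3 3
```

The equalities hold for the duplex as a whole; a single strand taken alone need not contain equal numbers of A and T or of G and C.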

The geometry of the double helix is such that adjacent base pairs are 0.34 nm apart and rotated by 36° relative to each other around the axis of the helix. Consequently, there are 10 base pairs per turn of the helix, and the helix pitch is 3.4 nm. The double helix has a diameter of about 2 nm, and two grooves are formed in it, a major and a minor one. This is due to the fact that the sugar-phosphate backbone is located farther from the axis of the helix than the nitrogenous bases.
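The arithmetic behind these figures can be sketched in Python. The constants come from the paragraph above; the function names are illustrative, and the last calculation assumes, as a simplification, that an entire molecule is an ideal straight B-form helix.

```python
import math

RISE_PER_BP_NM = 0.34    # distance between adjacent base pairs
TWIST_PER_BP_DEG = 36.0  # rotation between adjacent base pairs


def bp_per_turn() -> float:
    """Base pairs needed for one full 360-degree turn."""
    return 360.0 / TWIST_PER_BP_DEG


def pitch_nm() -> float:
    """Length of one full turn along the helix axis."""
    return bp_per_turn() * RISE_PER_BP_NM


def contour_length_nm(n_bp: int) -> float:
    """Axial length of an ideal B-form helix of n_bp base pairs."""
    return n_bp * RISE_PER_BP_NM


if __name__ == "__main__":
    print(f"{bp_per_turn():.0f} bp per turn, pitch {pitch_nm():.1f} nm")
    # A 263 Mbp molecule (the chromosome 1 figure quoted earlier in the
    # article) stretched out as a straight helix, converted from nm to cm:
    length_cm = contour_length_nm(263_000_000) * 1e-7
    print(f"{length_cm:.2f} cm")
```

Under that idealization, a 263 Mbp molecule would be roughly 9 cm long, which gives a sense of how tightly DNA must be packed inside a nucleus.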

Fig. 8. The principle of complementarity and antiparallelism of DNA strands

The stability of the DNA structure is due to different types of interaction, the main ones being H-bonds between bases and interplanar interaction (stacking). Stacking provides not only favorable van der Waals contacts between atoms but also additional stabilization due to the overlap of p-orbitals of atoms in parallel bases. Stabilization is also facilitated by a favorable hydrophobic effect, which shields the low-polarity bases from direct contact with the aqueous medium. In contrast, the sugar-phosphate backbone with its polar and ionized groups is exposed, which also stabilizes the structure.

Four polymorphic forms of DNA are known: A, B, C and Z. The usual structure is B-DNA, in which the planes of the base pairs are perpendicular to the axis of the double helix (Fig. 7). In A-DNA, the base-pair planes are tilted approximately 20° from the normal to the axis of the right-handed double helix; there are 11 base pairs per turn of the helix. In C-DNA, there are 9 base pairs per turn. Z-DNA is a left-handed helix with 12 base pairs per turn; the planes of the bases are approximately perpendicular to the axis of the helix. DNA in a cell is usually in the B-form, but some of its sections can be in the A- or Z-form or even in a different conformation.

The double helix of DNA is not a frozen formation; it is in constant motion:

  • the links in the chains are deformed;
  • complementary base pairs open and close;
  • DNA interacts with proteins;
  • if the torsional stress in the molecule is high, it unwinds locally;
  • the right-handed helix turns into a left-handed one.

There are 3 DNA fractions:

1. Highly repetitive (satellite): up to 10^6 copies (10% in mice). It is not involved in protein synthesis; it separates genes, provides crossing over, and contains transposons.

2. Moderately repetitive: up to 10^2 - 10^3 copies of genes (15% in mice). It contains genes for the synthesis of tRNA, ribosomal proteins and chromatin proteins.

3. Unique (non-repetitive): 75% in mice (56% in humans). It consists of structural genes.

Localization of DNA: 95% of DNA is localized in the nucleus in chromosomes (linear DNA) and 5% in mitochondria, plastids and the cell center in the form of circular DNA.

DNA functions: storage and transmission of information; repair; replication.

The two DNA strands in the gene region are fundamentally different in their functional role: one of them is the coding, or sense, strand; the other is the template strand.

This means that in the process of "reading" a gene (transcription or synthesis of pre-mRNA), the template DNA chain acts as a template. The product of this process, pre-mRNA, coincides in its nucleotide sequence with the coding strand of DNA (with the replacement of thymine bases with uracil bases).

Thus, it turns out that with the help of the template DNA strand during transcription, the genetic information of the coding DNA strand is reproduced in the RNA structure.
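The relationship between the two strands and the transcript can be demonstrated in a few lines of Python. This is a sketch with hypothetical names; it reuses the example duplex 5'-TAGGCAT-3' / 3'-ATCCGTA-5' from earlier in the article and ignores everything about real transcription except base pairing.

```python
# Pairing of a DNA template base with the RNA base placed opposite it.
# RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA (written 5'->3') complementary to a template strand
    that is written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


if __name__ == "__main__":
    coding = "TAGGCAT"    # coding (sense) strand, 5'->3'
    template = "ATCCGTA"  # template strand, 3'->5'
    rna = transcribe(template)
    print(rna)  # UAGGCAU
    # The transcript matches the coding strand with T replaced by U.
    print(rna == coding.replace("T", "U"))  # True
```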

The main template-directed processes inherent in all living organisms are DNA replication, transcription and translation.

Replication - the process by which information encoded in the base sequence of the parent DNA molecule is transmitted with maximum accuracy to the daughter DNA. In semi-conservative replication, daughter cells of the first generation receive DNA in which one strand comes from the parent and the second strand is newly synthesized. The process is carried out with the participation of DNA polymerases, which belong to the class of transferases. The role of the template is played by the separated strands of the double-stranded parental DNA, and the substrates are deoxyribonucleoside-5'-triphosphates.

Transcription - the process of transferring genetic information from DNA to RNA. All types of RNA (mRNA, rRNA and tRNA) are synthesized according to the sequence of bases in the DNA, which serves as a template. Only one DNA strand, the so-called "+" strand, is transcribed. The process takes place with the participation of RNA polymerases. The substrates are ribonucleoside-5'-triphosphates.

The processes of replication and transcription in prokaryotes and eukaryotes differ significantly in their rate of occurrence and in individual mechanisms.

Translation - the process of decoding mRNA, as a result of which information is translated from the language of the mRNA base sequence into the language of the amino acid sequence of the protein. Translation is carried out on ribosomes; the substrates are aminoacyl-tRNAs.
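The decoding step can be illustrated with a minimal Python sketch. The codon table below is deliberately partial (only the codons needed for the example, taken from the standard genetic code), and the mRNA sequence and function name are hypothetical. Real translation also involves start-site selection, tRNA charging and the ribosome itself, none of which is modeled here.

```python
# A partial codon table from the standard genetic code: only the codons
# used in the example below. None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met",  # methionine, also the usual start codon
    "GCU": "Ala",
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": None,   # stop
}


def translate(mrna: str) -> list:
    """Read the mRNA in consecutive triplets from its first base and
    return the amino acids encoded up to the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    print(translate("AUGGCUUUUUAA"))  # ['Met', 'Ala', 'Phe']
```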

Template-directed DNA synthesis, catalyzed by DNA polymerases, performs two main functions: DNA replication, that is, the synthesis of new daughter strands, and the repair of double-stranded DNA with breaks in one of the strands resulting from the excision of damaged parts of that strand by nucleases. There are three types of DNA polymerases in prokaryotes and in eukaryotes. In prokaryotes, polymerases of types I, II and III are distinguished, designated pol I, pol II and pol III. The latter catalyzes the synthesis of the growing chain, pol I plays an important role in the process of DNA maturation, and the functions of pol II are not fully understood. In eukaryotic cells, DNA polymerase α is involved in chromosome replication, DNA polymerase β is involved in repair, and the γ variety is the enzyme that replicates mitochondrial DNA. These enzymes, regardless of the type of cell in which replication occurs, attach a nucleotide to the OH group at the 3'-end of a DNA strand, which therefore grows in the 5'→3' direction. These enzymes are therefore said to have 5'→3' polymerase activity. In addition, they all exhibit the ability to degrade DNA, cleaving nucleotides in the 3'→5' direction, that is, they are 3'→5' exonucleases.

In 1957, Meselson and Stahl, studying E. coli, found that on each separated strand the DNA polymerase enzyme builds a new, complementary strand. This is the semi-conservative mode of replication: in every daughter molecule one chain is old and the other is new.
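The bookkeeping of semi-conservative replication can be followed with a small Python simulation. It is an illustrative sketch, not a model of the actual experiment: each duplex is represented only by the labels of its two strands ("old" for an original parental strand, "new" for a strand synthesized later), and all names are hypothetical.

```python
def replicate(duplexes):
    """One round of semi-conservative replication: each strand of every
    duplex becomes the template for a newly synthesized partner."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "new"))
        daughters.append((strand_b, "new"))
    return daughters


def count_with_parental_strand(duplexes):
    """Number of duplexes that still contain an original parental strand."""
    return sum(1 for duplex in duplexes if "old" in duplex)


if __name__ == "__main__":
    population = [("old", "old")]  # a single parental duplex
    for generation in range(1, 4):
        population = replicate(population)
        print(generation, len(population),
              count_with_parental_strand(population))
```

Starting from one parental duplex, the total number of molecules doubles each round while the number containing an original strand stays at two, because the two parental strands are never destroyed and never rejoin each other.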

Usually, replication begins at strictly defined sites, called ori sites (from origin of replication), and spreads from these sites in both directions. Branch points of the parental DNA strands form at the ori sites. The region adjacent to the branching point is called the replication fork (Fig. 9). During synthesis, the replication fork moves along the molecule, and new sections of the parental DNA are progressively unwound until the fork reaches the termination point. The chains are separated by special enzymes, helicases. The energy required for this is released through the hydrolysis of ATP. Helicases move along polynucleotide chains in both directions.

To start DNA synthesis, a primer is needed. A short RNA (10-60 nucleotides) plays the role of the primer. It is synthesized complementary to a specific DNA region with the participation of primase. After the primer is formed, DNA polymerase is activated. Unlike helicases, DNA polymerases can only move from the 3'- to the 5'-end of the template. Therefore, as the double-stranded parental DNA unwinds, continuous elongation of the growing strand can proceed along only one strand of the template, the one along which the replication fork moves from the 3'- to the 5'-end. The continuously synthesized chain is called the leading strand. Synthesis of the lagging strand also begins with the formation of a primer and proceeds in the direction opposite to that of the leading strand, away from the replication fork. The lagging strand is synthesized in pieces (Okazaki fragments), since a primer is formed only when the replication fork exposes a part of the template that has an affinity for primase. Ligation (joining) of Okazaki fragments to form a single chain is called the maturation process.

During chain maturation, the RNA primer is removed both from the 5'-end of the leading strand and from the 5'-ends of the Okazaki fragments, and these fragments are ligated to each other. The primer is removed by the 5'→3' exonuclease activity of pol I. The same enzyme replaces the removed RNA with deoxynucleotides using its 5'→3' polymerase activity. If a "wrong" nucleotide is attached, "proofreading" is carried out: the removal of bases that form non-complementary pairs. This process provides extremely high replication accuracy, about one error in 10^9 bp.

Fig. 9. DNA replication: 1 - replication fork; 2 - DNA polymerase (pol I, maturation); 3 - DNA polymerase (pol III, "proofreading"); 4 - helicase; 5 - gyrase (topoisomerase); 6 - proteins destabilizing the double helix.


Correction is carried out when a "wrong" nucleotide, unable to form the necessary hydrogen bonds with the template, is attached to the 3'-end of the growing chain. When pol III mistakenly attaches the wrong base, its 3'→5' exonuclease activity is "turned on" and this base is immediately removed, after which polymerase activity resumes. This simple mechanism works because pol III can act as a polymerase only on a perfect DNA double helix with absolutely correct base pairing.

Another mechanism for removing RNA fragments relies on a special ribonuclease present in cells, RNase H. This enzyme is specific for double-stranded structures built from one ribonucleotide chain and one deoxyribonucleotide chain, and it hydrolyzes the first of them.

RNase H can likewise remove an RNA primer, after which the gap is filled in by DNA polymerase. At the final stage of fragment assembly, DNA ligase joins the fragments in the required order, catalyzing the formation of phosphodiester bonds.

The unwinding of a part of the DNA double helix in eukaryotic chromosomes by helicases leads to supercoiling of the rest of the structure, which inevitably affects the rate of the replication process. Supercoiling is prevented by DNA topoisomerases.

Thus, in addition to DNA polymerase, a large set of enzymes is involved in DNA replication: helicase, primase, RNase H, DNA ligase, and topoisomerase. This does not exhaust the list of enzymes and proteins involved in template-directed DNA biosynthesis. However, many of the participants in this process are still poorly understood.

In the process of replication, "proofreading" occurs: the removal of incorrect bases (those forming non-complementary pairs) incorporated into the newly synthesized DNA. This process provides extremely high replication accuracy, about one error in 10^9 bp.
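To get a feel for what one error per billion base pairs means, the expected number of errors in a single round of replication can be estimated as the product of molecule length and error rate. The Python sketch below uses the error rate quoted above and, as an example input, the 263 Mbp figure given earlier in the article for human chromosome 1. The function name is hypothetical, and treating the rate as uniform along the molecule is a simplifying assumption.

```python
import math

ERROR_RATE_PER_BP = 1e-9  # about one error per 10^9 bp, as quoted above


def expected_errors(n_bp: int, error_rate: float = ERROR_RATE_PER_BP) -> float:
    """Expected number of misincorporated bases when a molecule of n_bp
    base pairs is copied once, assuming a uniform per-base error rate."""
    return n_bp * error_rate


if __name__ == "__main__":
    # 263 Mbp molecule: on average well under one error per replication.
    print(f"{expected_errors(263_000_000):.3f}")
```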

Telomeres. In 1938, the classical geneticists B. McClintock and H. Muller showed that there are special structures at the ends of chromosomes, which they called telomeres (from the Greek telos, end, and meros, part).

Scientists found that only the telomeres were resistant to the effects of X-ray exposure. Chromosomes lacking their end regions, by contrast, begin to fuse, leading to severe genetic abnormalities. Thus, telomeres ensure the individuality of chromosomes. Telomeres are tightly packed (heterochromatin) and are inaccessible to enzymes (telomerase, methylase, endonuclease, etc.).

Telomere functions.

1. Mechanical: a) connection of the ends of sister chromatids after the S-phase; b) fixation of chromosomes to the nuclear membrane, which ensures the conjugation of homologues.

2. Stabilization: a) protection from underreplication of genetically significant DNA sections (telomeres are not transcribed); b) stabilization of the ends of the torn chromosomes. In patients with α-thalassemia, chromosome 16d breaks in the α-globin genes and telomere repeats (TTAGGG) are added to the damaged end.

3. Influence on gene expression. The activity of genes located near telomeres is reduced. This effect is called transcriptional silencing.

4. "Counting function". Telomeres act as a clock that counts the number of cell divisions. Each division shortens telomeres by 50-65 bp, while their total length in the cells of the human embryo is 10-15 thousand bp.
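The "counting" idea can be made concrete with a back-of-the-envelope calculation. The Python sketch below divides the embryonic telomere length by the loss per division, using the ranges given in the list above. It rests on a loud assumption that is not in the source: that a cell could keep dividing until its telomeres are used up entirely. The result is therefore only an upper bound on the number of divisions the "clock" could count, and the function name is hypothetical.

```python
def division_range(length_range_bp=(10_000, 15_000), loss_range_bp=(50, 65)):
    """Return (fewest, most) whole divisions before the telomere would be
    fully consumed, ASSUMING shortening continues down to zero length."""
    shortest, longest = length_range_bp
    smallest_loss, largest_loss = loss_range_bp
    fewest = shortest // largest_loss  # short telomere, fast shortening
    most = longest // smallest_loss    # long telomere, slow shortening
    return fewest, most


if __name__ == "__main__":
    print(division_range())  # (153, 300)
```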

Telomeric DNA came to the attention of biologists relatively recently. The first objects of research were unicellular protozoa, ciliates (Tetrahymena), whose cells contain several tens of thousands of very small chromosomes and, therefore, many telomeres per cell (higher eukaryotes have fewer than 100 telomeres per cell).

In the telomeric DNA of ciliates, blocks of 6 nucleotide residues are repeated many times. One DNA strand contains the block of 2 thymines and 4 guanines (TTGGGG, the G-chain), and the complementary strand contains 2 adenines and 4 cytosines (AACCCC, the C-chain).

Imagine the surprise of scientists when they found that human telomeric DNA differs from that of ciliates in just one letter and forms blocks of 2 thymines, 1 adenine and 3 guanines (TTAGGG). Moreover, it turned out that the telomeres (G-chain) of all mammals, reptiles, amphibians, birds and fish are built from TTAGGG blocks.
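The "one letter" difference and the repeating structure are easy to check in code. The Python sketch below compares the ciliate block (two thymines followed by four guanines, TTGGGG) with the human block TTAGGG, and counts how many times a block is repeated at the start of a sequence. The function names and the test sequence are hypothetical.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def count_tandem_repeats(sequence: str, unit: str) -> int:
    """Number of consecutive copies of `unit` at the start of `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    count = 0
    while sequence.startswith(unit, count * len(unit)):
        count += 1
    return count


if __name__ == "__main__":
    print(hamming_distance("TTGGGG", "TTAGGG"))  # 1
    # Four complete human-type blocks followed by an incomplete one.
    print(count_tandem_repeats("TTAGGG" * 4 + "TTA", "TTAGGG"))  # 4
```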

However, there is nothing surprising here, since no proteins are encoded in telomeric DNA (it does not contain genes). In all organisms, telomeres perform the universal functions discussed above. A very important characteristic of telomeric DNA is its length. In humans, it ranges from 2 to 20 thousand base pairs, and in some species of mice it can reach hundreds of thousands of bp. It is known that special proteins located near telomeres support their function and are involved in their construction.

It has been proven that for normal functioning, each linear DNA must have two telomeres: one telomere at each end.

Prokaryotes do not have telomeres - their DNA is closed in a ring.