
5.4 Nomenclature, Abridged Formulas and Abbreviations

The fragments resulting from enzymatic or chemical hydrolysis of nucleic acids, and their synthetic analogues, are usually termed oligonucleotides if the molecule contains a small number of monomers. The prefix "oligo" comes from carbohydrate chemistry, where oligomers (from Greek "oligos", few, and "meros", part) are compounds intermediate in size between monomers and polymers. Natural and synthetic nucleotide polymers comprising more than ten monomer units are usually called polynucleotides. Synthetic poly(oligo)nucleotides may have internucleotide linkages other than those of the 3'-5' type (such as 5'-5' or 2'-5'). According to the exact number of monomer units in the chain, oligonucleotides can be referred to as dinucleotides, trinucleotides, and so on. Similar terms are used to denote polynucleotides with a particular chain length. The names defining the number of monomer units in the chain have Greek or Latin roots. For example, a 15-membered nucleotide is known as a pentadecanucleotide and a 20-membered one as an icosanucleotide.
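The chain-length naming described above can be illustrated with a short sketch. The function name `chain_length_name` and the prefix table are assumptions made for this example; the table lists the standard Greek/Latin multiplying prefixes for 2 to 20 units and is not taken from the original text beyond the four names it mentions.

```python
# Standard multiplying prefixes for chains of 2-20 monomer units.
# The table and function name are illustrative assumptions.
PREFIXES = {
    2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa", 7: "hepta",
    8: "octa", 9: "nona", 10: "deca", 11: "undeca", 12: "dodeca",
    13: "trideca", 14: "tetradeca", 15: "pentadeca", 16: "hexadeca",
    17: "heptadeca", 18: "octadeca", 19: "nonadeca", 20: "icosa",
}


def chain_length_name(n: int, deoxy: bool = False) -> str:
    """Return the chain-length name, e.g. 'pentadecanucleotide' for n = 15."""
    if n not in PREFIXES:
        raise ValueError(f"no prefix tabulated for a chain of {n} units")
    sugar = "deoxyribo" if deoxy else ""
    return f"{PREFIXES[n]}{sugar}nucleotide"


print(chain_length_name(15))             # pentadecanucleotide
print(chain_length_name(20))             # icosanucleotide
print(chain_length_name(4, deoxy=True))  # tetradeoxyribonucleotide
```

Longer chains would need a rule for composing prefixes, which this sketch deliberately leaves out.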

By analogy with the commonly accepted nomenclature of polypeptides, a polynucleotide is regarded as a chain in which every nucleotide esterifies the hydroxyl of the next nucleotide, rather than as a set of nucleosides linked by phosphodiester groups. The complete name of an oligo(poly)nucleotide is therefore composed like that of a polypeptide: it includes the names of the nucleotide residues, separated by the type of internucleotide linkage between the corresponding monomer units, given in parentheses.

The name of a polynucleotide usually begins with the 5' end (the residue of the 5'-terminal nucleotide being named first). In such cases, the internucleotide linkage is denoted as (3'→5'), the arrow indicating the direction in which the name is composed (from the 5' toward the 3' end of the chain).

The name of a polynucleotide chain may also be composed in reverse order (from the 3' to the 5' end). Then, the internucleotide linkage should be denoted as (5'→3'). Given below by way of example are the names of a tetradeoxyribonucleotide and a triribonucleotide.

[Figure: names of a tetradeoxyribonucleotide and a triribonucleotide]

The choice of the direction in which the name is composed is usually determined by the nature of the terminal groups of the chain. As a rule, poly(oligo)nucleotides are written in the direction from the 5' to the 3' end. The abbreviations that in most cases replace the cumbersome complete structural formulas are formed following the same rules that govern the writing of abridged formulas of monomer components. The only addition to these rules is the abbreviation of the internucleotide linkage (internucleotide phosphodiester group). Such a linkage is symbolized by a curved or straight sloping (diagonal) line interrupted by lowercase "p". Thus, the internucleotide phosphorus atom is denoted in exactly the same way as the phosphate group in a monomer component or that at the end of the polynucleotide chain, while its phosphodiester nature is expressed as bonds with the 3'-hydroxyl (left diagonal line) and 5'-hydroxyl (right diagonal line).

Although abridged formulas greatly facilitate the writing of polynucleotides, they become impractical when the structural formulas of whole nucleic acids are to be written (e.g., tRNAs consisting of 75-85 monomer units). In such instances, use is made of abbreviated letter symbols according to the rules that apply to nucleotides. Here, too, the internucleotide phosphorus is denoted by lowercase "p", just as in the above scheme, inserted between the uppercase letters standing for the corresponding nucleotides. The letter "p" to the left of the uppercase letter denotes the 5'-phosphomonoester group, while the same letter to the right stands for the 3'-phosphomonoester group. The symbol of the internucleotide phosphorus ("p") can be replaced by a hyphen, which further simplifies the structural formulas of nucleic acids, or even dropped altogether. In either case, only the symbols of the terminal phosphate groups ("p") are left. In such abridged notation, deoxyribopolynucleotides are distinguished from ribopolynucleotides by the letter "d" placed before the abbreviation, which is enclosed in parentheses. For example, a decaribonucleotide and a decadeoxyribonucleotide can be written as follows:

Decaribonucleotide        Decadeoxyribonucleotide
pUpGpCpCpApUpGpApGpU      d(pTpGpCpCpApTpGpApGpT)
or                        or
pU-G-C-C-A-U-G-A-G-U      d(pT-G-C-C-A-T-G-A-G-T)
or                        or
pUGCCAUGAGU               d(pTGCCATGAGT)
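The three equivalent notations shown above differ only in how the internucleotide phosphate is written, so converting between them is mechanical. The following is a minimal sketch, assuming single-letter nucleoside symbols and lowercase "p" reserved for phosphate; the helper names `parse_abridged`, `format_abridged` and `convert` are hypothetical and introduced only for illustration.

```python
def parse_abridged(formula: str):
    """Split an abridged formula into (is_deoxy, has_5p, residues, has_3p).

    Assumes single uppercase letters for nucleosides and lowercase 'p'
    for phosphate groups, as in the examples in the text.
    """
    s = formula.strip()
    is_deoxy = s.startswith("d(") and s.endswith(")")
    if is_deoxy:
        s = s[2:-1]              # strip the enclosing d( ... )
    s = s.replace("-", "")       # hyphens only stand in for internal 'p'
    has_5p = s.startswith("p")   # 'p' left of the first letter: 5'-phosphate
    has_3p = s.endswith("p")     # 'p' right of the last letter: 3'-phosphate
    residues = [c for c in s if c != "p"]
    return is_deoxy, has_5p, residues, has_3p


def format_abridged(is_deoxy, has_5p, residues, has_3p, link="p"):
    """Rebuild the formula; link is 'p', '-' or '' (linkage symbol dropped)."""
    body = link.join(residues)
    body = ("p" if has_5p else "") + body + ("p" if has_3p else "")
    return f"d({body})" if is_deoxy else body


def convert(formula: str, link: str) -> str:
    """Rewrite a formula using the chosen internucleotide linkage symbol."""
    return format_abridged(*parse_abridged(formula), link=link)


print(convert("pUpGpCpCpApUpGpApGpU", "-"))     # pU-G-C-C-A-U-G-A-G-U
print(convert("pU-G-C-C-A-U-G-A-G-U", ""))      # pUGCCAUGAGU
print(convert("d(pTGCCATGAGT)", "p"))           # d(pTpGpCpCpApTpGpApGpT)
```

Terminal phosphates are kept explicit in every output form, matching the rule that only the terminal "p" symbols survive simplification.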

In the case of homopolynucleotides (that is, polynucleotides consisting of only one kind of monomer), the formulas become even simpler. The formula of oligothymidylic acid, for example, can be written as follows:

d(pT)n, where n ≤ 10

As can be seen from this formula, oligothymidylic acid contains ten or fewer nucleotides and has a 5'-terminal phosphate group and a free 3'-hydroxyl on the terminal pentose.
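To make the meaning of the repeat formula concrete, the sketch below expands d(pT)n into the full "p" notation for a given n. The function name `expand_homopolymer` and its parameters are assumptions made for this illustration.

```python
def expand_homopolymer(symbol: str, n: int, deoxy: bool = True,
                       five_prime_phosphate: bool = True) -> str:
    """Expand a homopolymer formula such as d(pT)n into explicit 'p' notation.

    The first 'p' (if present) is the 5'-terminal phosphate; every other 'p'
    is an internucleotide phosphate. The 3' end is left as a free hydroxyl.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    body = "p".join([symbol] * n)   # n residues joined by n-1 internal 'p'
    if five_prime_phosphate:
        body = "p" + body           # add the 5'-terminal phosphate
    return f"d({body})" if deoxy else body


print(expand_homopolymer("T", 3))   # d(pTpTpT)
print(expand_homopolymer("A", 2, deoxy=False, five_prime_phosphate=False))  # ApA
```

With the 5'-phosphate present, the expansion is simply n repeats of "pT", which is why the compact form d(pT)n reads naturally.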

Sometimes, still simpler formulas are used. For instance, the above oligomer can also be written as d(T)≤10, or d(T)n, or oligo(dT).

High-molecular weight homopolymers are written using a single letter (nucleoside symbol) following the prefix "poly". For example, the polymer of deoxyriboadenylic acid is written as poly(dA) and that of riboadenylic acid is written as poly(A).