Date: 2025-06-13

A codon is a sequence of three nucleotides that specifies a single amino acid or a stop signal; read in succession, codons determine the amino acid sequence of a protein. There are 64 different codons: 61 specify amino acids, and the remaining three act as stop signals that mark the end of protein synthesis. Each codon instructs a cell to start creating a protein chain, add a specific amino acid to the chain, or stop creating the chain. The genetic code, which determines how a cell interprets the nucleotide sequence, was once believed to be universal, but it is now understood to have evolved, so the translation of a given codon can vary with the genetic source.

| Characteristics | Values |
| --- | --- |
| Definition of a codon | A sequence of three consecutive nucleotides that occurs in mRNA |
| Number of codons | 64 |
| Number of codons specifying amino acids | 61 |
| Number of codons acting as stop signals | 3 |
| Number of nucleotide types in mRNA | 4 (A, U, G, and C) |
| Number of amino acids | 20 |
| Number of codons per amino acid | Most amino acids are indicated by more than one codon. Methionine and tryptophan are the only amino acids specified by a single codon. The remaining 18 amino acids are specified by between two and six codons each. |
| Start codon | AUG, which corresponds to the amino acid methionine |
| Stop codons | UAG, UGA, and UAA |
| Alternative genetic codes | Four novel alternative genetic codes (numbered 34–37) were discovered in bacterial genomes by Shulgina and Eddy |

What You'll Learn

Codon redundancy and degeneracy

Start and termination codons

Transfer RNA (tRNA)

Translation and initiation

The evolution of the genetic code

Codon redundancy and degeneracy

The concept of codons was first described by Francis Crick and his colleagues in 1961. A codon is a sequence of three consecutive nucleotides in mRNA that either directs the incorporation of a specific amino acid into a protein or signals the start or termination of protein synthesis. There are 64 possible codons: 61 specify amino acids, and three serve as stop signals.

Each codon is specific to only one amino acid or one stop signal. However, the genetic code is described as degenerate or redundant because a single amino acid may be coded for by more than one codon. In other words, degeneracy is the existence of multiple three-nucleotide codons that specify the same amino acid. For example, the codons GAA and GAG both specify glutamic acid and so are redundant with each other, but neither specifies any other amino acid.
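The counts above can be checked with a short sketch. It assumes the standard genetic code written in RNA letters, with one-letter amino acid symbols and `*` standing for a stop signal; the names `CODON_TABLE` and `synonymous_codons` are illustrative only and do not come from any particular library.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, listed with the bases in U, C, A, G order.
# '*' marks a stop codon; the layout of this string is an assumption of the sketch.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(table):
    """Group codons by the amino acid (or stop signal) they specify."""
    groups = defaultdict(list)
    for codon, amino_acid in table.items():
        groups[amino_acid].append(codon)
    return dict(groups)


groups = synonymous_codons(CODON_TABLE)
sense = {aa: codons for aa, codons in groups.items() if aa != "*"}

print(len(CODON_TABLE))                                         # 64
print(sum(len(codons) for codons in sense.values()))            # 61
print(sorted(groups["*"]))                                      # ['UAA', 'UAG', 'UGA']
print(sorted(groups["E"]))                                      # ['GAA', 'GAG']
print(sorted(aa for aa, c in sense.items() if len(c) == 1))     # ['M', 'W']
print(sorted(aa for aa, c in sense.items() if len(c) == 6))     # ['L', 'R', 'S']
```

Every codon maps to exactly one entry in the dictionary, which is the "specific" half of the statement; several keys sharing one value is the "degenerate" half. Under this table, methionine (M) and tryptophan (W) are the amino acids with a single codon.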

The degeneracy of the genetic code accounts for the existence of synonymous mutations, which are nucleotide changes that leave the encoded amino acid unchanged. A further practical consequence of the code's structure is that some errors that do change the amino acid still have little effect on the protein, because the substituted amino acid has similar hydrophilicity or hydrophobicity. For instance, a codon of NUN (where N = any nucleotide) tends to code for hydrophobic amino acids, while NCN yields amino acid residues that are small in size and moderate in hydropathy.
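As a rough illustration of these two points, the sketch below classifies a single-nucleotide substitution and lists the amino acids that share a middle base. It assumes the standard code in RNA letters with `*` for stop; `classify_substitution` and `middle_base_family` are hypothetical helper names, and the label "stop-loss" is simply the term chosen here for a stop codon mutating into a sense codon.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(codon, position, new_base):
    """Classify the effect of replacing one base (position 0, 1 or 2) of a codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"   # same amino acid (or still a stop signal)
    if after == "*":
        return "nonsense"     # a sense codon became a stop codon
    if before == "*":
        return "stop-loss"    # a stop codon became a sense codon
    return "missense"         # a different amino acid is encoded


def middle_base_family(base):
    """Return the set of amino acids encoded by codons whose second base is `base`."""
    return {CODON_TABLE[first + base + third] for first in BASES for third in BASES}


print(classify_substitution("GAA", 2, "G"))   # synonymous (GAA -> GAG, both glutamic acid)
print(classify_substitution("GAA", 1, "U"))   # missense   (GAA -> GUA, valine)
print(classify_substitution("GAA", 0, "U"))   # nonsense   (GAA -> UAA, stop)
print(sorted(middle_base_family("U")))        # ['F', 'I', 'L', 'M', 'V']
print(sorted(middle_base_family("C")))        # ['A', 'P', 'S', 'T']
```

The NUN family comes out as phenylalanine, isoleucine, leucine, methionine and valine, and the NCN family as alanine, proline, serine and threonine, which matches the pattern described above.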

Codon usage bias, the unequal use of synonymous codons, is a universal feature of genomes. Part of this bias may be explained by the slippage hypothesis, under which slippage elongates and shortens DNA repeat sequences. The codon bias is more pronounced for serine codons in regions of tandem repeats.
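One simple way to make codon bias concrete is to count, for each amino acid, how often each of its synonymous codons is used in a coding sequence. The sketch below does this for a made-up toy sequence; the sequence, the function name `codon_usage`, and the choice of plain within-amino-acid fractions as the measure are all assumptions of this example rather than a standard from the text.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(mrna):
    """For each amino acid, return the fraction of its occurrences written with each codon."""
    usable = len(mrna) - len(mrna) % 3          # ignore a trailing partial codon
    counts = Counter(mrna[i:i + 3] for i in range(0, usable, 3))
    per_amino_acid = defaultdict(dict)
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]][codon] = n
    return {
        aa: {codon: n / sum(codons.values()) for codon, n in codons.items()}
        for aa, codons in per_amino_acid.items()
    }


# Toy coding sequence (hypothetical): AUG GAA GAA GAG UCU UCU AGC UAA
toy = "AUGGAAGAAGAGUCUUCUAGCUAA"
usage = codon_usage(toy)
print(usage["E"])   # GAA is used for two of the three glutamic acid codons
print(usage["S"])   # UCU is used for two of the three serine codons
```

With equal usage, each of the two glutamic acid codons would appear half the time; any consistent departure from that across a genome is what is meant by bias.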

Start and termination codons

The codon AUG, which encodes the amino acid methionine, is known as the start codon. It is the first codon of the transcribed mRNA to be translated. During protein synthesis, an initiator tRNA recognises the initiation codon AUG and, with the help of initiation factors, initiates mRNA translation. The start codon thus marks the point at which the mRNA sequence begins to be read and translated into a string of amino acids for protein synthesis.

The start codon AUG sets the reading frame for translation. The ribosome attaches to the mRNA strand and finds the beginning of the genetic message, called the start codon. The specific tRNA molecule that carries methionine then recognises and binds to this codon, completing the initiation phase of translation.

The codons UAA (Ochre), UAG (Amber), and UGA (Opal) are known as stop codons and they signal the termination of translation. They are also referred to as nonsense codons because they do not code for amino acids. Instead, they mark the end of the polypeptide chain during translation. During protein synthesis, these stop codons cause the release of new polypeptide chains from the ribosome.

In special cases, the stop codons UGA and UAG can encode a 21st and 22nd amino acid, respectively. These amino acids are selenocysteine and pyrrolysine.

Transfer RNA (tRNA)

The tRNA molecule has a distinctive folded structure with three hairpin loops that form a three-leafed clover shape. One of these loops contains a sequence called the anticodon, which can recognize and decode an mRNA codon. Each tRNA molecule has a corresponding amino acid attached to one end. When a tRNA molecule recognizes and binds to its matching codon in the ribosome, it transfers the correct amino acid to the end of the growing amino acid chain. This process is known as translation, and it synthesizes a protein from an mRNA molecule.
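Codon recognition by the anticodon can be sketched as antiparallel base pairing. The example below assumes strict Watson-Crick pairing (A with U, G with C) and writes both codon and anticodon in the 5′ to 3′ direction, so the anticodon is the reverse complement of the codon. Wobble pairing, which lets one tRNA read several codons in real cells, is deliberately left out, and the pool with one tRNA per sense codon is a hypothetical simplification.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the anticodon (5'->3') that pairs with a codon (5'->3') by strict Watson-Crick rules."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


# Hypothetical tRNA pool: one tRNA per sense codon, keyed by anticodon.
TRNA_POOL = {
    anticodon_for(codon): amino_acid
    for codon, amino_acid in CODON_TABLE.items()
    if amino_acid != "*"
}

print(anticodon_for("AUG"))                   # CAU
print(TRNA_POOL[anticodon_for("AUG")])        # M
print(TRNA_POOL.get(anticodon_for("UAA")))    # None: no tRNA in this pool reads a stop codon
```

Looking up a stop codon returns nothing because, in this simplified pool, no tRNA carries a matching anticodon, which mirrors why stop codons end the chain instead of adding an amino acid.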

The structure of tRNA can be divided into its primary, secondary, and tertiary structures. The secondary structure is often visualized as a cloverleaf, and the tertiary structure is an L-shaped 3D structure that allows tRNA to fit into the P and A sites of the ribosome. The top half of the tRNA molecule, including the T-arm and acceptor stem, and the bottom half, including the D-arm and anticodon arm, are independent units in both structure and function.

One proposed account of the evolution of type I and type II tRNAs is the three 31-nucleotide minihelix tRNA evolution theorem, whose proponents also use it to describe the pre-life to life transition on Earth.

Translation and initiation

Translation initiation is a complex process involving multiple steps and components, and it is a critical step in the synthesis of every protein. It involves the interaction of the mRNA with the ribosomal small subunit, along with other essential components. This process differs between prokaryotes and eukaryotes.

In prokaryotes, translation initiation involves the direct interaction of the ribosomal RNA with the mRNA. This interaction positions the ribosome for the subsequent synthesis of the protein. Eukaryotes, on the other hand, have evolved a more intricate mechanism that relies predominantly on protein-RNA and protein-protein interactions. The evolution of novel mRNA structures, such as the 5' cap and the poly(A) tail, has enabled eukaryotes to develop new mechanisms for recruiting the ribosome to the mRNA.

The initiation of protein synthesis, in general, involves the assembly of the catalytic rRNA and an initiator tRNA at the correct AUG codon of a template mRNA. This AUG codon is typically recognised as the signal for the start of protein synthesis, also known as the initiation codon. The initiator tRNA plays a crucial role in this process: by pairing with the initiation codon, it delivers the first amino acid, methionine, and sets the reading frame for the codons that follow.

Translation initiation in eukaryotes is particularly complex and requires the coordination of multiple factors. At least eleven different initiation factors (eIFs) are needed to initiate translation properly. These factors work together to ensure the accurate assembly of the initiator tRNA, ribosome, and mRNA. The eIF4F complex, composed of several subunits, plays a key role in recognising the 5′ 7-methylguanosine (m7G) cap on the mRNA, marking the start of the initiation process.

The final stages of initiation involve the joining of the 40S and 60S ribosomal subunits to form the complete 80S ribosome. This assembly allows most eIFs to be ejected and translation elongation to begin with the formation of the first peptide bond. Translation initiation is thus a highly regulated and coordinated process that ensures proteins are synthesised accurately according to the genetic code.

The evolution of the genetic code

The genetic code is nearly universal, and the arrangement of codons in the standard codon table is highly non-random. The evolution of the genetic code is a complex topic with several theories and ongoing research.

The three main concepts on the origin and evolution of the code are the stereochemical theory, the coevolution theory, and the error minimization theory. The stereochemical theory suggests that codon assignments are determined by the physicochemical affinity between amino acids and the cognate codons (anticodons). The coevolution theory proposes that the code structure evolved alongside amino acid biosynthesis pathways. The error minimization theory posits that the principal factor in the code's evolution was the selection to minimize the adverse effects of point mutations and translation errors.

These theories are not mutually exclusive and are compatible with the frozen accident hypothesis, which suggests that the standard code might not have any special properties and is shared so widely simply because all life forms descend from a common ancestor. Changes to the code after this initial establishment were likely prevented by the harmful effects of codon reassignment.

Mathematical analysis of the structure and potential evolutionary trajectories of the code reveals that, while the standard code is robust to translational misreading, many possible codes are more robust still. This is consistent with the standard code having evolved from a random code through successive reassignments of codon series, without reaching the most robust code possible.
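The kind of comparison behind such analyses can be sketched in a few lines. The example below scores a code by the mean squared change in hydropathy over all single-nucleotide substitutions between sense codons, then compares the standard code with alternatives made by shuffling which amino acid each synonymous block encodes. The cost function, the use of the Kyte-Doolittle hydropathy scale, the shuffling scheme, the sample size and the seed are all assumptions of this sketch; it is not the method of any specific study, and no particular outcome is claimed here.

```python
import random
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy index, used here as one possible amino acid property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mutation_cost(code):
    """Mean squared hydropathy change over single-base substitutions between sense codons."""
    total, count = 0.0, 0
    for codon, amino_acid in code.items():
        if amino_acid == "*":
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                neighbour = codon[:position] + base + codon[position + 1:]
                other = code[neighbour]
                if other == "*":
                    continue            # substitutions to stop codons are not scored
                total += (HYDROPATHY[amino_acid] - HYDROPATHY[other]) ** 2
                count += 1
    return total / count


def shuffled_code(code, rng):
    """Keep the synonymous blocks and stop codons; permute which amino acid each block encodes."""
    amino_acids = sorted(set(code.values()) - {"*"})
    permuted = amino_acids[:]
    rng.shuffle(permuted)
    mapping = dict(zip(amino_acids, permuted))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in code.items()}


rng = random.Random(0)                  # fixed seed so the run is repeatable
standard_cost = mutation_cost(STANDARD)
random_costs = [mutation_cost(shuffled_code(STANDARD, rng)) for _ in range(1000)]
better = sum(cost < standard_cost for cost in random_costs)

print(f"standard code cost: {standard_cost:.2f}")
print(f"shuffled codes with lower cost: {better} of {len(random_costs)}")
```

A lower cost means that a typical point mutation or misreading swaps in an amino acid of more similar hydropathy. Counting how many shuffled codes beat the standard code gives a rough sense of where it sits among the alternatives under this particular measure.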

In summary, the evolution of the genetic code is a multifaceted topic with contributions from various factors, including historical accidents, evolutionary forces, and chemical constraints. While there are several theories and hypotheses, a comprehensive understanding of the origin and evolution of the genetic code remains a subject of ongoing research.

