Doubling the DNA alphabet

Implications for life in the universe and DNA storage

Life on Earth is written in a DNA alphabet composed of only four bases, or letters: A, T, G and C. Researchers have long asked whether there is something special about these four letters and whether this is the only code that could support life. At a basic level, the question can be addressed by building an expanded alphabet and determining the properties of DNA that includes additional synthetic letters. A new study of such an expanded alphabet informs our current understanding of terrestrial DNA and suggests that extraterrestrial life forms could have evolved using a different genetic code from the one found here on Earth. The work has immediate applications in synthetic biology for the creation of new molecules and greatly expands the capacity to store information in DNA.
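The storage claim can be made concrete with a little arithmetic. The sketch below is illustrative only and assumes an idealized model in which every letter is equally likely and no sequence constraints apply; the function names are hypothetical and not taken from the study.

```python
import math

def bits_per_base(alphabet_size: int) -> float:
    """Maximum information per position, assuming all letters are equally likely."""
    return math.log2(alphabet_size)

def distinct_sequences(alphabet_size: int, length: int) -> int:
    """Number of different sequences of a given length over the alphabet."""
    return alphabet_size ** length

standard = bits_per_base(4)    # four-letter alphabet: A, T, G, C
expanded = bits_per_base(8)    # eight-letter alphabet with four added letters

print(f"4 letters: {standard} bits per base")
print(f"8 letters: {expanded} bits per base")
print(f"16-letter sequences: {distinct_sequences(4, 16)} vs {distinct_sequences(8, 16)}")
```

Under these assumptions, an eight-letter alphabet carries three bits per position instead of two, a 50 percent gain in theoretical density. Real storage systems would fall short of this ceiling because of synthesis and sequencing constraints.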

Now, in breakthrough work funded by NASA, NSF and NIGMS, Dr. Steven Benner at the Foundation for Applied Molecular Evolution, Dr. Millie Georgiadis at the Indiana University School of Medicine, and colleagues at biotechnology companies and other universities have provided evidence that the standard DNA code can be expanded to eight letters. The resulting “hachimoji DNA” (“hachi” means eight and “moji” means letter in Japanese) uses four novel synthetic nucleobases (B, S, P and Z) in addition to A, T, C and G and still retains critical features of natural DNA [1,2]. Structurally, hachimoji DNA can adopt the standard double-helical form of DNA and retains Watson-Crick complementary base pairing, which allows the expanded DNA to be faithfully replicated and transcribed by polymerases to produce hachimoji DNA copies and hachimoji RNA. These properties are essential for a genetic system that can support life.

Image: Crystal structure of a double helix built from eight hachimoji building blocks, G (green), A (red), C (dark blue), T (yellow), B (cyan), S (pink), P (purple), and Z (orange). The first four building blocks are found in human DNA; the last four are synthetic, and possibly present in alien life. Each strand of the double helix has the sequence CTTAPCBTASGZTAAG. Notable is the geometric regularity of the pairs, a regularity that is needed for evolution.
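The caption's statement that both strands carry the same sequence can be checked with a short script. This is a minimal sketch, not part of the study. The natural pairs A:T and G:C are standard; the synthetic pairs P:Z and B:S are not spelled out in this article and are taken here from the published description of hachimoji DNA, so treat them as an assumption. The names `PAIRS` and `reverse_complement` are hypothetical.

```python
# Complementary partners for each of the eight letters.
# A:T and G:C are the natural pairs; P:Z and B:S are the synthetic pairs
# (assumed from the published hachimoji work, not stated in this article).
PAIRS = {
    "A": "T", "T": "A",
    "G": "C", "C": "G",
    "P": "Z", "Z": "P",
    "B": "S", "S": "B",
}

def reverse_complement(seq: str) -> str:
    """Return the partner strand, read in its own 5'-to-3' direction."""
    return "".join(PAIRS[base] for base in reversed(seq))

strand = "CTTAPCBTASGZTAAG"  # sequence given in the image caption
partner = reverse_complement(strand)

print(partner)
print("self-complementary:", partner == strand)
```

With these pairing rules the sequence is its own reverse complement, which is why two copies of a single strand can form the double helix shown in the crystal structure.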