The Mirror-Image World of Biology and Its Potential for DNA Data Storage

Researchers find that L-DNA is a stable tool for the long-term storage of information.

In 2020, every person created about 1.7 megabytes of data per second. The astronomical amount of data created around the globe uses excessive energy and relies on storage systems that are not designed to last long-term. Researchers are left with an important question: How can we reliably and sustainably store data in an increasingly digital world?

One solution may lie within nature, as all living organisms use DNA to store massive amounts of information. For example, the human genome contains approximately 3.2 billion base pairs that encode all the information we need to grow and survive. Every cell in the human body contains a copy of the entire human genome, and we have evolved incredible systems for maintaining, copying, and storing this DNA.
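A rough back-of-the-envelope calculation gives a sense of scale. The sketch below assumes each base pair can take one of four values and therefore carries at most two bits; that is a theoretical ceiling for illustration, not a measurement of how much useful information the genome holds. The function and constant names are hypothetical.

```python
# Assumption: one base pair is one of four possibilities, so it carries
# at most log2(4) = 2 bits. This is an upper bound, not a measurement.
BASE_PAIRS_IN_HUMAN_GENOME = 3_200_000_000  # approximate figure from the text
BITS_PER_BASE_PAIR = 2


def genome_capacity_megabytes(
    base_pairs: int = BASE_PAIRS_IN_HUMAN_GENOME,
    bits_per_base_pair: int = BITS_PER_BASE_PAIR,
) -> float:
    """Return the theoretical raw capacity in megabytes (10**6 bytes)."""
    total_bits = base_pairs * bits_per_base_pair
    total_bytes = total_bits / 8
    return total_bytes / 1_000_000


if __name__ == "__main__":
    print(f"{genome_capacity_megabytes():.0f} MB")
```

Under these assumptions, one copy of the genome works out to roughly 800 megabytes of raw capacity, packed into a single cell nucleus.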

The powerful storage capacity of DNA is not lost on scientists and engineers. For years, researchers have evaluated the promise of DNA as a storage device to meet future data demands. Although DNA is one of the most stable molecules in nature, it is not immune to degradation. Many people are rightly concerned about the reliability of long-term storage in biological systems.

Scientists recently proposed an elegant solution to this problem in the journal Nature Biotechnology. Fan et al. report a system that can encode text passages in l-DNA, the mirror image of the d-DNA found in the natural world. Stable even after a year of incubation in pond water, this synthesized version of DNA may be a long-term solution for data storage on a massive scale.

Why Store Data in DNA?

As the volume of data grows, DNA is an attractive storage medium because of its remarkable compactness. Scientific American recently offered a striking illustration: stored in DNA, all the data on Facebook could fit into half of a poppy seed.

The typical lifetime of archival storage for electronic data is a few decades. DNA improves on this with the potential to last for centuries or even millennia, as the recovery of DNA from well-preserved fossils suggests.

Another issue with traditional data storage is the time spent copying information from one medium to another as technology changes and improves. You may have found it annoying to transition your movies from VHS to DVD to digital copies. Now imagine needing to copy entire archives' worth of information to keep data accessible and well-preserved. It can be a time-consuming and tedious process.

In contrast, DNA itself does not need to change as DNA sequencing technology improves, driven by healthcare and research. The technology simply becomes more sensitive and accurate at reading the same base material. Because the medium stays the same, archiving would not require frequent transfers of information to other media. Thanks to the polymerase chain reaction (PCR) and other available molecular biology tools, it is also cheap and fast to copy sequences of DNA if required. Plus, software already exists to translate the binary language of computers into the four-letter code of DNA.
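To make that last point concrete, here is a minimal sketch of how binary data can be mapped onto the four bases. It assumes the simplest possible scheme, two bits per base, with an arbitrary assignment of bit pairs to A, C, G and T. The mapping and function names are hypothetical; this is not the encoding used by Fan et al. or by any particular tool, and practical schemes typically add error correction and avoid problematic sequences such as long runs of one base.

```python
# Hypothetical mapping, chosen only for illustration: two bits per base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_text(text: str) -> str:
    """Encode UTF-8 text as a string of DNA bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_sequence(sequence: str) -> str:
    """Reverse encode_text: turn a string of bases back into UTF-8 text."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    sequence = encode_text("Hi")
    print(sequence)                   # four bases per byte of input
    print(decode_sequence(sequence))  # round-trips back to the input text
```

With this scheme every byte becomes four bases, so a short paragraph of text turns into a sequence a few thousand bases long.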

The concept of DNA as a storage device is not new. Norbert Wiener and Mikhail Neiman first proposed using DNA to store information in the mid-1960s. However, the idea was not demonstrated until the 1990s, with an art installation by Joe Davis that used DNA and living cells to encode a 35-bit image. Since then, researchers have continued to improve on existing systems for writing, storing and reading DNA sequences.

A history of significant publications on the digital data storage capacity of DNA (Ceze et al., Nat Rev Genetics, 2019).

So why hasn’t DNA become more broadly popular as a data storage tool? Despite being stable under ideal conditions, DNA is still readily degraded by heat, light and even the passage of time. Although this is a valuable trait in the natural world, it raises concerns about long-term data storage in DNA. Even if the molecule can solve the problem of the physical space required to store information, it is not a complete solution if that information is not reliably recoverable.

The Potential of a Mirrored World

Over 160 years ago, Louis Pasteur discovered molecular chirality, the property that gives many biologically relevant molecules their handedness. Like gloves and scissors, molecules can exist in either a right- or left-handed configuration. Handedness has significant consequences for living organisms. Like DNA, the amino acids that make up proteins, sugars and more are all chiral molecules with handedness. In discovering that molecules can exist in two distinct mirror-image forms, Pasteur proposed a mirror-image world of biology, where molecules are found in both orientations. However, a complete mirror-image version of nature has not been discovered in the environment, nor has one been recreated in a lab.

To better understand this concept, consider a right and left hand. They are mirror images of each other, but your fingers and thumbs don’t line up if you place your right hand over your left hand. Many molecules work the same way, and living organisms can only function using a specific orientation of each. Just as you can’t fit your right hand into a left-handed glove, cells cannot function unless their molecules are in the correct orientation.

The two mirror images of a chiral molecule are commonly designated D and L, which can be loosely thought of as right- and left-handed. Living organisms use only one form of each molecule: natural DNA and sugars such as glucose are in the D form (d-DNA, d-glucose), while the amino acids that make up proteins are in the L form. Researchers can now use chemical synthesis to create the opposite form of many molecules, including l-DNA, in the lab.

Diagram showing the ‘handedness’ of a chiral molecule. (Image courtesy of the University of Oxford Mathematical Institute.)

Because l-DNA is not found in the natural world, the enzymes that break down natural d-DNA do not recognize it. That makes l-DNA highly resistant to biological degradation and a stable form of information storage. This is precisely what Fan et al. found in a recent Nature Biotechnology paper expanding on the potential of l-DNA as an information storage strategy.

Using l-DNA as a Stable and Efficient Information Storage System

Despite the promise of l-DNA as a storage molecule, synthesizing mirror images of molecules in a lab is expensive and time-consuming. While d-glucose is a cheap staple of most labs, l-glucose is expensive in even small quantities. So, how can we make l-DNA affordably and efficiently to meet modern digital storage requirements?

The answer is once again found in nature, where an enzyme (a type of protein) called DNA polymerase joins individual nucleotides to form long chains of DNA. In molecular biology labs, scientists can use DNA polymerase to create DNA of any sequence they desire. However, DNA polymerase enzymes found in nature only make d-DNA. This means a synthetic mirror-world DNA polymerase needs to be created from scratch in the lab to make l-DNA.

In their recent paper, Fan et al. did exactly that, creating a mirror-image version of the high-fidelity Pfu DNA polymerase capable of synthesizing l-DNA. This was no small feat. The synthesis required optimizing the protein sequence so that it was cheaper and easier to make without affecting its function. The authors ultimately made 15 individual subunits of the enzyme, which formed the final protein when combined.

Mirror-image Pfu DNA polymerase (Fan et al., Nature Biotechnology, 2021).

The Pfu DNA polymerase is commonly used in molecular biology labs for PCR because it is heat stable and accurate. The authors used their mirror-image version to encode in l-DNA a paragraph written by Louis Pasteur in 1860 that proposed a mirror-image world of biology. The l-DNA synthesized using this system was stable after one year of incubation in pond water, which degrades conventional d-DNA within one day. Collectively, this system represents a promising strategy for bioorthogonal information storage.

What’s Next for Storing Information in Biological Systems?

Challenges still exist for information storage in DNA, as both the right- and left-handed versions of the molecule are sensitive to heat, light and time. With additional tools, researchers hope to realize a future where the compact nature of DNA can be exploited as an efficient storage strategy for information.

Beyond research, several companies, including Microsoft and Illumina, formed the DNA Data Storage Alliance earlier this year. The goal of the tech group is to accelerate the development of DNA as a storage molecule by advancing molecular biology techniques.

Current storage strategies rely on electronic devices such as hard drives and flash drives. DNA is so compact that it can be six orders of magnitude denser than other storage media available today. If l-DNA can overcome the stability issues and the cost of sequencing continues to fall, DNA could be an attractive solution to the global storage crisis.