A 3-D animation shows how DNA molecules can be used in computational devices. (Credit: Microsoft Research)

Data storage is getting better and better, but the final frontier for the long-term preservation of digital bits may well be DNA molecules – and the University of Washington and Microsoft Research are trying to make it so.

The work on DNA data storage architecture is one of the angles in today’s New York Times story on the subject. In a paper prepared for an international conference on software architecture, researchers propose an error-tolerant encoding scheme for reading out the data in a DNA-based storage system.
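
To make the idea of error-tolerant encoding concrete, here is a minimal Python sketch. It is not the researchers' actual scheme, which the story does not detail. It assumes a simple mapping of two bits per nucleotide and a third strand holding the XOR of two payloads, so that any one of the three strands can be rebuilt from the other two. The names `encode`, `decode` and `xor_bytes` are hypothetical, chosen for illustration only.

```python
# Illustrative assumption: each nucleotide carries two bits (A=00, C=01, G=10, T=11).
BASES = "ACGT"

def encode(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )

def decode(strand: str) -> bytes:
    """Invert encode(): every four nucleotides become one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length payloads."""
    return bytes(x ^ y for x, y in zip(a, b))

# Store two payloads plus their XOR as a redundancy strand.
a, b = b"DNA!", b"data"
strands = [encode(a), encode(b), encode(xor_bytes(a, b))]

# Pretend the second strand was lost: rebuild it from the first and third.
recovered_b = xor_bytes(decode(strands[0]), decode(strands[2]))
```

A practical encoding would also need to cope with constraints this sketch ignores, such as the errors that arise when DNA is synthesized and sequenced.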

Such a system would take advantage of DNA’s amazing information storage capability – the kind of capability that’s able to hold all the genetic code for any organism in a single cell. The Times notes that all of the world’s digital information could be stored in about 2.4 gallons (9 liters) of solution, which would fit inside a typical water cooler bottle.

The benefits of such a system include not only fitting a lot of data into a small space, but also preserving that data for millennia under the right conditions. Silicon-based storage media degrade over time, but if data is stored in DNA, “all you have to do is keep it cold and dry,” Microsoft computer architect Karin Strauss is quoted as saying.

The University of Washington and Microsoft have partnered with Twist Bioscience in San Francisco to develop a low-cost system for synthesizing custom-encoded DNA on a massive scale. (For more about what Twist is doing, check out last month’s report from Wired.)

They’re not the only ones delving into the DNA data frontier. Researchers at the University of Illinois are also working on a rewritable, random-access DNA storage system. As a proof of concept, they encoded parts of the Wikipedia pages for six universities, then edited those entries using gene sequencers and chemical primers. “The current drawback of our scheme is high cost, as synthesizing long DNA blocks is expensive,” the researchers said.

The European Bioinformatics Institute is also a pioneer in the field. In 2013, scientists at the British-based institute used DNA to encode all 154 of Shakespeare’s sonnets, plus a grab bag of documents, audio and imagery.

The trailblazer in DNA data storage is arguably a research group at Harvard led by geneticist George Church. The Harvard researchers used DNA to encode the content of “Regenesis,” a book written by Church and Ed Regis – including 53,426 words of text and 11 images, plus the code for a JavaScript app.

Now Church and his colleagues are working to encode the data for an entire movie, the 1902 French silent-film classic titled “A Trip to the Moon,” into industrial-strength synthetic DNA. The project is being funded by Technicolor, which wants to use DNA data storage systems to preserve films for its customers in the movie industry.

Church’s choice of the movie to be encoded is apt, and not just because it’s short. Strangely enough, the moon has long been talked about as a potential site for ultra-long-term backup data storage.

This story has been updated to give more credit to University of Washington researchers, including Luis Ceze and Georg Seelig.
