Thoughts on DNA and Big Data

DNA (deoxyribonucleic acid) is a molecule that encodes the genetic instructions used in the development and functioning of all known living organisms… this is the most common textbook definition. Soon, a second definition may need to be added to the dictionary, one where DNA equals ‘biological’ storage.

As a matter of fact, scientists have been studying DNA as a possible storage medium for a while now, and a team at Harvard’s Wyss Institute recently stored about 700 terabytes of data in a single gram of DNA, treating DNA as just another storage device.

How does this work? Simply put, strands of DNA that each store 96 bits are synthesized, with each of the bases (T, G, A, C) representing a binary value. The stored data is read back by sequencing the strands and converting the bases back into binary; see the ExtremeTech article for the details.
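To make the idea concrete, here is a minimal Python sketch of that round trip. It is only an illustration, not the Harvard team's actual pipeline: it assumes the mapping reported for that work (A or C stands for 0, T or G stands for 1), it splits the bit stream into 96-bit blocks as described above, and it leaves out the addressing and error handling a real system would need. The function names are made up for this example.

```python
import random

# Assumed mapping: A or C encodes a 0 bit, T or G encodes a 1 bit.
ZERO_BASES = "AC"
ONE_BASES = "TG"
BASE_TO_BIT = {"A": "0", "C": "0", "T": "1", "G": "1"}

BLOCK_BITS = 96  # data bits per synthesized strand, as stated in the post


def text_to_bits(text):
    """Turn text into a string of '0'/'1' characters (UTF-8, 8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits):
    """Inverse of text_to_bits; expects a multiple of 8 bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode(bits, rng=None):
    """Map each bit to one base, picking randomly between the two valid bases."""
    rng = rng or random.Random(0)  # seeded so the sketch is repeatable
    return "".join(
        rng.choice(ONE_BASES if bit == "1" else ZERO_BASES) for bit in bits
    )


def decode(strand):
    """'Sequence' a strand: map every base back to its bit."""
    return "".join(BASE_TO_BIT[base] for base in strand.upper())


def split_blocks(bits, size=BLOCK_BITS):
    """Cut a bit string into blocks of at most `size` bits."""
    return [bits[i:i + size] for i in range(0, len(bits), size)]


if __name__ == "__main__":
    message = "DNA as storage"
    strands = [encode(block) for block in split_blocks(text_to_bits(message))]
    recovered = bits_to_text("".join(decode(s) for s in strands))
    print(strands)
    print(recovered)
```

Having two bases per bit value gives the encoder some freedom in choosing the sequence; here the choice is simply random.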

DNA would be extremely good as a potential storage medium because:

  1. it’s incredibly dense
  2. it’s volumetric, as opposed to a hard disk, which is planar
  3. it’s incredibly stable

So, what’s the catch? Well, data retrieval with DNA storage is slow. However, “it is fast enough for very-long-term archival”, according to studies conducted to date. I am sure that retrieval times will improve, even if we are not quite there yet. In the future, a quantum computation approach may be used to speed things up… but I don’t want to speculate too much at this point.

One thing is definite, though: we’ll soon get to a place where it will be more expensive for a company not to store data than to store it. Will companies be ready for that time? When we look at the numbers, it is easy to see that ‘Big Data’ is not only a big deal now, but will become a much bigger deal as time goes by.

Nowadays, people create about 2.5 quintillion bytes (2,500,000,000,000,000,000) of data per day. In addition, more than 90% of the world’s data has been created in the last few years alone. Consider that one gram of DNA can store 700 terabytes of data; storing the same amount on hard drives would take about 233 3-TB drives (weighing 300+ pounds in total). It is easy to see that we are approaching a point where ‘biological storage’ lets us record anything and everything, all the time… and cheaply.
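For anyone who wants to check the drive comparison or try other volumes, here is a rough back-of-the-envelope sketch in Python. It assumes decimal terabytes (1 TB = 10^12 bytes) and takes the 700 TB per gram figure at face value; the helper names are made up for this example.

```python
TB = 10**12  # assumption: decimal terabytes

DNA_BYTES_PER_GRAM = 700 * TB  # density quoted in the post
DRIVE_BYTES = 3 * TB           # capacity of one 3-TB hard drive


def drives_needed(num_bytes, drive_bytes=DRIVE_BYTES):
    """Whole drives required to hold num_bytes (rounded up)."""
    return -(-num_bytes // drive_bytes)  # ceiling division on integers


def grams_of_dna(num_bytes):
    """Grams of DNA needed for num_bytes at the quoted density."""
    return num_bytes / DNA_BYTES_PER_GRAM


if __name__ == "__main__":
    # 700 / 3 is about 233.3, which the post rounds to 233 drives;
    # rounding up to whole drives gives 234.
    print(DNA_BYTES_PER_GRAM / DRIVE_BYTES)
    print(drives_needed(DNA_BYTES_PER_GRAM))

    # Example volume: one exabyte (10**18 bytes) would need roughly 1.4 kg of DNA.
    print(grams_of_dna(10**18))
```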

How are you and your company getting ready for the time when we’ll get to the promised land of unlimited and cheap storage? What is your ‘Big Data’ vision for your company?

cTcaGaaTccTcaacacTGcGGcaaTGaTGGTaTGaTTTTcTGaTaTGaaTaaaaccGGccTTccTGcGGGGcGGGaaTccGGGaTGGaTTcacaGaTTTacTcaTGaaGacaaGaaccaaTTTcTcaaTGaGTGTccGaaccaaGGTTaaTaGGaTTTGcTGGcTcGcGGGaaGaccTaccacaTGaccTTaTTaGGGGcTTcTTcGcGTaTGaTcGTccTaTcGTcTTGaaGGTcGaacTTTaaTG  (TRANSLATED into plain English – I look forward to your comments!)

To have some fun, translate your text to DNA-coding here – courtesy of Dave.

Andrea Serafini
