By feeding strings of human-written data into colonies of bacteria, scientists have discovered a way to turn tiny cells into living, squirming hard drives.

A team of Harvard scientists led by geneticist Seth Shipman has just developed a fascinating way to write chunks of information into the genetic code of living, growing bacterial cells. It could be the code for a computer program or the lines of a poem. Either way, these living memory sticks can pass this data on to their descendants, and scientists can later read that data by genotyping the bacteria. As Shipman explains in a paper published today in the journal Science, his method can upload roughly 100 bytes of data.

Bacterial Bytes

To be clear, scientists have already proven they can synthetically manufacture DNA in the lab and write into it pretty much whatever they want, including a full-length book on science. "But working within a living cell is an entirely different story and challenge," says Shipman. "Rather than synthesizing DNA and cutting it into a living cell, we wanted to know if we could use nature's own methods to write directly onto the genome of a bacterial cell, so it gets copied and pasted into every subsequent generation."

Before Shipman's experiment, the most information any scientist had ever permanently uploaded into a living cell was 11 bits. That's a mere 11 zeros and ones of binary data, less than your computer needs to encode two letters of the alphabet. Shipman's technique expanded this record to roughly 100 bytes of data. For reference, it would take your computer precisely one hundred bytes to encode this very sentence.
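
The arithmetic behind that comparison can be checked in a few lines. This is only a sketch, and it assumes a plain one-byte-per-character encoding (ASCII), which the article does not specify; the variable names are invented for illustration.

```python
# Assumption: one byte (8 bits) per character, as in ASCII.
sentence = (
    "For reference, it would take your computer precisely "
    "one hundred bytes to encode this very sentence."
)

sentence_bytes = len(sentence.encode("ascii"))  # size of the self-describing sentence
previous_record_bits = 11                       # earlier record quoted in the article
two_letters_bits = 2 * 8                        # two ASCII letters
new_record_bits = 100 * 8                       # roughly 100 bytes, expressed in bits

print(sentence_bytes)                           # 100
print(previous_record_bits < two_letters_bits)  # True: 11 bits cannot hold two letters
print(new_record_bits // previous_record_bits)  # 72: about a 72-fold jump over the old record
```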

Shipman used a fascinating immune response that certain bacteria have to protect themselves against viral infection. In the parlance of geneticists, this response is called the CRISPR/Cas system, and it's actually quite simple. Basically, when these bacteria are invaded by viruses, they can physically cut out a segment of the attacking virus's DNA and then paste it into a specific region of their own genome. This allows a bacterium to remember what a certain virus looked like in case it ever tries to invade again. Not only that, but this genetic memory is passed on to the bacterium's progeny, transferring the viral immunity to future generations.

Shipman's team found that as long as you introduce a segment of genetic data that looks like viral DNA to a colony of bacteria carrying this CRISPR/Cas system, the bacteria will gobble it up and incorporate it into their genetic code. To turn a colony of bacteria into a jumble of tiny hard drives, all Shipman had to do was disperse loose segments of faux-viral DNA into a colony of E. coli bacteria that had the CRISPR/Cas system. The DNA segments these scientists used were actually just arbitrary strings of data (say, secret messages written in the A, T, C, and G nucleotide letters of the genetic code) that were bookended with chunks of real virus DNA. Shipman introduced one segment of information at a time and let the bacteria do the rest, storing away information like fastidious librarians.
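
To get a feel for how ordinary data could be spelled out in nucleotide letters, here is a minimal sketch. The article does not say how Shipman's team mapped bits to bases, so the two-bits-per-base table below (and the function names) are assumptions made purely for illustration, not the team's scheme.

```python
# Hypothetical mapping: each base carries two bits (A=00, C=01, G=10, T=11).
BASES = "ACGT"

def bytes_to_dna(data: bytes) -> str:
    """Spell out each byte as four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)

def dna_to_bytes(sequence: str) -> bytes:
    """Reverse the mapping: every four bases become one byte."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for start in range(0, len(sequence), 4):
        byte = 0
        for base in sequence[start:start + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

encoded = bytes_to_dna(b"Hi")
print(encoded)                # CAGACGGC
print(dna_to_bytes(encoded))  # b'Hi'
```

Under this assumed mapping, 100 bytes of data would take 400 bases, not counting the real viral DNA that bookends each segment.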

Bugs in the Hard Drive

The team was helped by an important fact about CRISPR. The bacteria store their new immune system memories sequentially, so that viral DNA from earlier infections is recorded before DNA from more recent infections. "That's quite important," Shipman says. "If the new information was just stored randomly, that wouldn't be nearly as informative. You'd have to have tags on each piece of information to know when it was introduced into the cell. Here it's ordered sequentially, like the way you write down the words in a sentence."

There's just one complication. When Shipman introduces coded messages of viral DNA to his bacteria, not all of the bacteria eat up the message. By chance, some miss it. So if you were to introduce, word by word, the code for the sentence "This Message Is In Your Genes," using six introductions of viral DNA, not all the bacteria would have the complete message. Some would have "This In Genes," while others might only have "Is Genes," and so on. Even with these "errors," Shipman says, you can rapidly genotype a few thousand or million bacteria in a colony and, because the message is always recorded sequentially, deduce what the full message was with crystal clarity. It's like playing a game of DNA telephone.
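
The deduction step can be sketched as an ordering puzzle: every bacterium holds some of the words, always in the right order, so the reads can be merged into one sequence. The code below is a simplified illustration rather than the team's actual analysis. It assumes every segment is distinct and that no read scrambles the order, and the function name and example reads are hypothetical.

```python
from collections import defaultdict

def reconstruct(reads):
    """Merge ordered partial reads into the single order they all agree on."""
    successors = defaultdict(set)  # segment -> segments seen directly after it
    indegree = defaultdict(int)    # segment -> number of distinct direct predecessors
    segments = set()
    for read in reads:
        segments.update(read)
        for earlier, later in zip(read, read[1:]):
            if later not in successors[earlier]:
                successors[earlier].add(later)
                indegree[later] += 1

    # Repeatedly pick the one segment that nothing unplaced must precede.
    ready = [s for s in segments if indegree[s] == 0]
    message = []
    while ready:
        if len(ready) > 1:
            raise ValueError("reads do not pin down a unique order")
        current = ready.pop()
        message.append(current)
        for following in successors[current]:
            indegree[following] -= 1
            if indegree[following] == 0:
                ready.append(following)
    if len(message) != len(segments):
        raise ValueError("reads contradict each other")
    return message

# Hypothetical reads from five bacteria, each missing some of the six words.
reads = [
    ["This", "In", "Genes"],
    ["Is", "Genes"],
    ["This", "Message", "Is"],
    ["Is", "In", "Your", "Genes"],
    ["Message", "In", "Your"],
]
message = reconstruct(reads)
print(" ".join(message))  # This Message Is In Your Genes
```

With too few reads the order stays ambiguous and the sketch raises an error, which is one way to see why sampling thousands of cells matters.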

Shipman says the 100 bytes his team demonstrated is nowhere near the limit. Certain cells, like the microorganism Sulfolobus tokodaii, would have room for more than 3,000 bytes of data. And with synthetic engineering, it's not hard to imagine specially designed hard-drive bacteria with vastly expanded regions of their genetic code, able to rapidly upload vast amounts of data.