Information with a Twist

How DNA Became Nature's Hard Drive

More Than Just Genes

Imagine storing all the information from every movie, book, and website ever created in a container no larger than a sugar cube.

This isn't science fiction: it's the potential of encoding digital information in DNA, where a single gram has been reported to hold as much as 215 petabytes of data. This application represents just one frontier in the expanding field of biological information science, where the fundamental concepts of information theory intersect with cutting-edge biotechnology.
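As a rough illustration of what encoding digital information in DNA can mean, the sketch below maps every two bits to one of the four nucleotides. The particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for this example; practical storage schemes are more elaborate, typically adding error correction and avoiding sequences that are hard to synthesize or read.

```python
# Hypothetical 2-bits-per-base mapping, for illustration only.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(): every four nucleotides become one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Two bits per base is the ceiling for a four-letter alphabet, which is why the information density of DNA is usually discussed in those terms.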

For decades, scientists have understood that DNA operates much like a sophisticated programming language, complete with encoding, error-correction, and replication systems. What makes this possible is that all known life depends on information—specifically, coded information systems that share remarkable properties with human-designed communication technologies.

Today, companies like Twist Bioscience are leveraging these principles to transform how we store data, develop therapeutics, and understand biological systems. By applying information theory to biology, scientists are learning to read, write, and edit the very code of life itself, opening possibilities that range from anti-idiotypic antibodies that combat disease to designer enzymes that revolutionize manufacturing ³.

DNA Data Storage

Revolutionary potential for compact, long-term information storage

Information Theory

Mathematical framework for understanding biological information systems

Biotechnology

Applications ranging from therapeutics to manufacturing

The Language of Life: Key Concepts in Biological Information

What is Information Theory?

The mathematical foundation for understanding information was established by Claude Shannon, whose 1948 work "A Mathematical Theory of Communication" laid the groundwork for modern information science.

In Shannon's model, information communicates a decision between alternatives. Each symbol in an alphabet of possibilities can provide a measurable quantity of information, with the basic unit being the bit, which represents a decision between two equally probable choices ⁴.
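For an alphabet of N equally probable symbols, this definition gives the information per symbol as a base-2 logarithm. The worked values below apply it to the four DNA nucleotides and the twenty standard amino acids, under the simplifying assumption that every symbol is equally likely.

```latex
I = \log_2 N \ \text{bits}, \qquad
I_{\text{DNA}} = \log_2 4 = 2 \ \text{bits}, \qquad
I_{\text{protein}} = \log_2 20 \approx 4.32 \ \text{bits}
```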

Shannon introduced the concept of entropy (H) as a measure of uncertainty in information systems. Maximum entropy occurs when symbols are equally probable (creating the most uncertainty), while zero entropy indicates the same message is always produced (no uncertainty).
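Shannon's entropy is the sum over all symbols of -p × log2(p), where p is each symbol's probability. The sketch below estimates those probabilities from symbol frequencies in a message and checks the two extremes described above. The function name and the example strings are illustrative choices, not taken from any cited source.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Entropy in bits per symbol, using observed frequencies as probabilities."""
    counts = Counter(message)
    total = len(message)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


uniform = shannon_entropy("ACGT" * 25)  # four equally frequent symbols
constant = shannon_entropy("A" * 100)   # the same symbol every time
skewed = shannon_entropy("A" * 70 + "C" * 10 + "G" * 10 + "T" * 10)

print(uniform)   # maximum for a four-letter alphabet: 2 bits
print(constant)  # no uncertainty: 0 bits
print(skewed)    # somewhere in between
```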

From Theory to Biology: Information in Life

When information theory meets biology, we encounter fascinating questions. For example, does a bacterium that produces five similar variants of a protein possess more information than one producing a single highly-tuned version?

Scientists like Hubert Yockey and Kirk Durston pioneered the application of Shannon's theories to biology, with Durston introducing the concept of "functional information": the difference between the entropy of all possible messages and the entropy of those messages that provide a specific biological function ⁴.
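One simplified way to read this definition is site by site: take the entropy each position would have if any amino acid were allowed, and subtract the entropy actually observed at that position among functional sequences. The sketch below does this for a toy alignment. It assumes that sites are independent and that the ground state is uniform over 20 amino acids, and the sequences are invented for illustration; it is not presented as the published method.

```python
import math
from collections import Counter


def site_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues observed at one alignment column."""
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(column).values())


def functional_information(alignment: list[str], alphabet_size: int = 20) -> float:
    """Sum over sites of (ground-state entropy - functional entropy), in bits."""
    ground = math.log2(alphabet_size)  # assumes all residues equally likely a priori
    columns = zip(*alignment)          # transpose: one tuple of residues per site
    return sum(ground - site_entropy("".join(col)) for col in columns)


# Hypothetical toy alignment of four functional variants (not real protein data).
toy_alignment = ["MKVL", "MKIL", "MRVL", "MKAL"]
print(round(functional_information(toy_alignment), 2))
```

Fully conserved columns contribute the full log2(20) bits, while variable columns contribute less, which matches the observation that tolerant positions carry less functional information than invariant ones.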

This framework helps us understand why some proteins tolerate variations in their sequence while others remain virtually unchanged across billions of years of evolution.

Information Theory Timeline

1948

Claude Shannon publishes "A Mathematical Theory of Communication"

1970s

Hubert Yockey applies information theory to molecular biology

2000s

Kirk Durston introduces concept of "functional information"

Present

Information theory applied to DNA data storage and synthetic biology

The Experiment: Designing Better Biological Systems

Background: One Factor at a Time vs. Systematic Design

To understand how scientists optimize biological systems, consider a classic experimental challenge: maximizing the yield of a chemical process where temperature and pH are key factors.

The intuitive approach, One Factor At a Time (OFAT), involves varying temperature while keeping pH constant, finding the optimal temperature, then varying pH while keeping temperature at this "optimal" setting. This method seems straightforward but contains a critical flaw: it cannot detect interactions between factors ².

In our example, an OFAT approach might identify temperature at 30°C and pH at 6.0 as "optimal," yielding 86%. However, because each factor is tuned at only one fixed setting of the other, this approach misses the interaction between the two variables.
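The effect can be reproduced with a small simulation. The yield surface below is entirely hypothetical: its coefficients, grid, and starting pH are assumptions chosen to include a temperature × pH interaction, and it does not reproduce the numbers quoted in this article. It only shows how a single OFAT pass can settle on a point that is not the true optimum.

```python
def yield_pct(temp_c: float, ph: float) -> float:
    """Hypothetical yield surface with a temperature x pH interaction term."""
    t = (temp_c - 45) / 15
    p = ph - 7
    return 92 - 6 * t**2 - 6 * p**2 - 9 * t * p


temps = list(range(15, 61, 5))         # 15, 20, ..., 60 °C
phs = [5 + 0.5 * i for i in range(9)]  # 5.0, 5.5, ..., 9.0

# OFAT: hold pH at a starting value, tune temperature, then tune pH.
start_ph = 8.0
best_temp = max(temps, key=lambda t: yield_pct(t, start_ph))
best_ph = max(phs, key=lambda p: yield_pct(best_temp, p))
ofat_yield = yield_pct(best_temp, best_ph)

# Exhaustive grid search, used here only as a reference for the true optimum.
grid_yield, grid_temp, grid_ph = max(
    (yield_pct(t, p), t, p) for t in temps for p in phs
)

print(f"OFAT: {best_temp} °C, pH {best_ph}: {ofat_yield:.1f}%")
print(f"Grid: {grid_temp} °C, pH {grid_ph}: {grid_yield:.1f}%")
```

On this surface the OFAT pass stops at 35 °C and pH 7.5, short of the optimum at 45 °C and pH 7, because the best temperature depends on which pH was held fixed while searching for it.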

Methodology: A Designed Experiment

Design of Experiments (DOE) provides a powerful alternative framework. For our two-factor scenario, researchers would:

Define the experimental region - establishing low and high levels for both temperature and pH
Select treatment combinations - testing the "corners" of the experimental space
Add center points - allowing detection of curvature in the response surface
Include replicates - repeating at least one treatment combination
Randomize run order - performing tests in random sequence

This approach systematically explores the entire experimental space while requiring far fewer resources than testing every possible combination ².
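The steps listed above can be sketched as a short run-list generator. The low and high levels, the number of center points, and the number of replicates are assumptions made for illustration, since the article does not state the settings used; the function name is likewise hypothetical.

```python
import itertools
import random


def two_factor_design(temp_levels, ph_levels, center_points=3, replicates=2, seed=0):
    """Build a randomized run list: replicated corner runs plus center points."""
    corners = list(itertools.product(temp_levels, ph_levels))
    center = (sum(temp_levels) / 2, sum(ph_levels) / 2)
    runs = corners * replicates + [center] * center_points
    random.Random(seed).shuffle(runs)  # randomize run order
    return runs


# Hypothetical low/high levels for temperature (°C) and pH.
design = two_factor_design(temp_levels=(30, 60), ph_levels=(6.0, 8.0))
for run_number, (temp_c, ph) in enumerate(design, start=1):
    print(run_number, temp_c, ph)
```

Fixing the seed keeps the randomized order reproducible, which is convenient when the run sheet has to be shared or audited.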

Results and Analysis: Discovering Hidden Relationships

The DOE approach revealed a markedly different relationship between temperature, pH, and yield than the OFAT method detected. The highest yield found in the designed runs (91%) occurred at 45°C and pH 8, five percentage points above the 86% identified through OFAT. The predicted-yield table below additionally lists 92% at 45°C and pH 7.

Experimental Approach	Runs Required	Maximum Yield Found
OFAT	13	86%
DOE	12	91%

Temperature	pH	Predicted Yield
30°C	6.0	86%
45°C	8.0	91%
45°C	7.0	92%

Analysis of Variance Components

Temperature main effect: 28%

pH main effect: 31%

Temperature*pH interaction: 37% (detected by DOE only)

Curvature: 4% (detected by DOE only)
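A breakdown like this comes from estimating each model term separately. As a minimal sketch, the code below computes the two main effects and the interaction from the four corner runs of a two-level design, then expresses each as a share of the variation they explain together. The corner yields are hypothetical, and curvature is left out because estimating it requires center points.

```python
# Hypothetical mean yields (%) at the four corners of a 2x2 design,
# keyed by coded levels (temperature, pH) with -1 = low and +1 = high.
corner_yield = {(-1, -1): 70.0, (+1, -1): 78.0, (-1, +1): 74.0, (+1, +1): 90.0}


def effect(contrast) -> float:
    """Average response where the contrast is +1 minus the average where it is -1."""
    signed = sum(contrast(t, p) * y for (t, p), y in corner_yield.items())
    return signed / (len(corner_yield) / 2)


effects = {
    "temperature": effect(lambda t, p: t),
    "pH": effect(lambda t, p: p),
    "temperature*pH": effect(lambda t, p: t * p),
}

# In a balanced two-level design, each term's sum of squares is proportional
# to its squared effect, so the shares can be computed from the effects alone.
total = sum(e**2 for e in effects.values())
for name, e in effects.items():
    print(f"{name}: effect {e:+.1f}, share {100 * e**2 / total:.1f}%")
```

A nonzero interaction effect means the benefit of raising temperature depends on the pH level, which is exactly the term an OFAT study has no way to estimate.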

The Scientist's Toolkit: Essential Research Reagents

Modern biological research depends on specialized tools and reagents that enable precise interrogation and manipulation of biological information.

Reagent/Tool	Primary Function	Key Applications
Twist Standard Hybridization Reagent Kit	Enables efficient and specific binding of panel probes to target DNA regions	Target enrichment for next-generation sequencing; whole exome sequencing ⁶
Anti-Idiotypic Antibodies	Bind specifically to the unique antigen-binding region (idiotype) of therapeutic antibodies	Pharmacokinetic (PK) assays; anti-drug antibody (ADA) assessments; quality control ³
Custom Antigens	High-quality proteins designed to specifically trigger immune responses	Antibody discovery campaigns; vaccine development; diagnostic assays ³
Twist Library Preparation Kits	Fragment DNA and add sequencing adapters for next-generation sequencing platforms	Whole genome sequencing; comprehensive genomic profiling; population genomics ⁹
Trinity Freestyle Fast Hybridization Kit	Rapid 1-hour hybridization for on-flow cell enrichment, eliminating multiple steps	Fast next-generation sequencing workflows; reduced hands-on time ⁹

Toolkit Evolution

These tools enable researchers to not only read biological information but to write and edit it as well. For instance, anti-idiotypic antibodies serve as critical reagents for monitoring therapeutic antibodies in clinical development, ensuring their safety and efficacy.

The sophistication of these tools continues to evolve. Recent collaborations between companies like Twist Bioscience and Element Biosciences have produced integrated workflows that reduce sample-to-sequencer time to just five hours while maintaining high accuracy across diverse applications ⁹.

Conclusion: The Future of Biological Information

The intersection of information theory and biology represents one of the most promising frontiers in science.

What began with Shannon's mathematical insights into communication systems has evolved into a sophisticated framework for understanding the fundamental programming of life itself. From designing better experiments that reveal hidden relationships between biological factors to developing reagents that manipulate molecular interactions with exquisite precision, the principles of information science are transforming biotechnology.

DNA Data Storage

Could preserve humanity's knowledge for millennia

Synthetic Biology

May create sustainable alternatives to traditional manufacturing

Personalized Medicine

Might tailor treatments to our individual genetic makeup

The "twist" in our story is both conceptual and corporate—the recognition that information isn't merely descriptive but fundamental to biological function, and the growing capacity of companies like Twist Bioscience to harness this understanding. As we continue to unravel nature's information systems, we move closer to solving some of humanity's most pressing challenges in health, sustainability, and technology.

The future of biological information science promises not just to transform how we understand life, but to enhance life itself through targeted interventions designed with informational precision.

References
