This is a more speculative post than usual: I had a thought I want to explore, but I haven’t worked through it rigorously, so take what follows with a grain of salt–there may be algorithms I don’t know about that render one or more parts of it moot.
The core requirement of cryptographic schemes is that they be easy to compute given all the right ingredients, but functionally impossible to compute otherwise. Typically this is accomplished via functions which can be computed very quickly, but can’t be inverted by any known P-time1 algorithm. Quantum computers thoroughly bork that for the classic approach used in most security systems, but post-quantum cryptographic methods are emerging to address this vulnerability.
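To make that asymmetry concrete, here’s a toy sketch (emphatically not a real cryptosystem): multiplying two primes is a single operation, while recovering them by naive trial division scales with the square root of the product. The function names and numbers are mine, purely for illustration.

```python
# Toy illustration of one-way asymmetry: multiplying is instant, while
# recovering the factors by naive trial division takes ~sqrt(n) steps.
# (Real schemes like RSA rest on far more careful versions of this idea.)

def easy_direction(p: int, q: int) -> int:
    return p * q  # one multiplication

def hard_direction(n: int) -> tuple[int, int]:
    d = 3
    while d * d <= n:          # ~sqrt(n) iterations in the worst case
        if n % d == 0:
            return d, n // d
        d += 2
    raise ValueError("no nontrivial odd factor found")

n = easy_direction(1_000_003, 1_000_033)  # instant
print(hard_direction(n))                  # ~500k trial divisions later...
```

At toy sizes the gap is barely noticeable; at real key sizes the easy direction stays easy while the hard direction becomes hopeless.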
An intuitive example of this was given by Steven Rudich, a computational complexity theorist, who phrased it as “I can recognize great music, but I can’t create great music.”2 While it can sometimes be a questionable endeavor to extend metaphors with additional metaphors, that quote combined with a recent bit of news to spark a thought for me.
This past week, the United Kingdom’s ARIA (Advanced Research & Invention Agency) posted a draft program thesis for the program I’ve been most eagerly waiting to hear more about. While the announcement of this cohort of program directors included some early bits of the idea under the info on Alex Obadia, the director of this particular program, we now have a bit more detail. Titled “Trust Everything, Everywhere”, it proposes to extend security and trust systems to new contexts and substrates–many of which have previously seen little to no attention outside of small groups of researchers.3 Among the substrates they list are non-biological materials, DNA, and brains. Neurosecurity is one of my main interests; that topic is addressed elsewhere on this blog and will continue to be in future posts. The idea of using DNA for security was new to me, however.
Most of the advantages emphasized in research on DNA cryptography are linked to the advantages of DNA computing more broadly.4 Per unit mass, DNA offers extremely high information density, enabling more efficient storage. It’s also highly stable, and because DNA is soluble, the volumetric density of parallel computations can be impressively high. Less a computational advantage than a unique feature: DNA computing–and thus equally a DNA cryptography scheme–could be embedded in living organisms. On the delightfully cinematic extreme, one could imagine a spy flick with a dead drop where one spy empties a sample of water with bacteria into a puddle; later, another spy collects a sample from that puddle, takes it to the DNA sequencer back at base, and applies a secret key to the various genomes found in the puddle until they identify the message encoded in one strain of bacteria via DNA steganography.5 I suspect nerds everywhere would be swooning over James Bond, PhD.
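To make the dead drop slightly less hand-wavy, here’s a minimal keyed DNA steganography toy: the message is packed two bits per nucleotide and spliced into a carrier sequence at a key-derived offset. The encoding scheme and key handling are invented for illustration; real proposals are considerably more sophisticated.

```python
import random

# Hypothetical 2-bits-per-nucleotide encoding; real schemes vary widely.
BASES = "ACGT"
B2I = {b: i for i, b in enumerate(BASES)}

def encode(message: bytes) -> str:
    """Pack each byte into four nucleotides (2 bits apiece, high bits first)."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in message
                   for shift in (6, 4, 2, 0))

def decode(dna: str) -> bytes:
    """Inverse of encode: four nucleotides back into one byte."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for b in dna[i:i + 4]:
            byte = (byte << 2) | B2I[b]
        out.append(byte)
    return bytes(out)

def hide(carrier: str, message: bytes, key: int) -> str:
    """Splice the encoded message into the carrier at a key-derived offset."""
    offset = key % len(carrier)
    return carrier[:offset] + encode(message) + carrier[offset:]

def reveal(stego: str, msg_len: int, key: int, carrier_len: int) -> bytes:
    offset = key % carrier_len  # the receiver knows the carrier's length
    return decode(stego[offset:offset + 4 * msg_len])

carrier = "".join(random.choice(BASES) for _ in range(200))  # stand-in genome
secret = b"meet at dawn"
stego = hide(carrier, secret, key=1337)
assert reveal(stego, len(secret), 1337, len(carrier)) == secret
```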
However, there doesn’t appear to be much research related to leveraging the computational properties of DNA as it’s used in organisms. What immediately came to my mind is the possibility of leveraging this to generate functions which are easy to “compute” by mixing all the right ingredients, but virtually impossible to solve without them.
Take, for example, the CRISPR-Cas9 system. From a computational perspective, this is a highly flexible function which bacteria evolved for computing on viral genomes. Now, in the wild these “computations” are actually just neutralizing a virus, as CRISPR functions as a sort of immune system for many species of bacteria. But the generality of the system, especially as applied already in genetic engineering research, produces pathways towards exactly the kind of combinatorial explosion in possible solutions which one wants from a good cryptographic scheme.
Let’s imagine, for example, a set of DNA sequences which encode (among other things) everything we need to generate CRISPR-Cas9 apparatuses with specially designed guide sequences.6 By having certain guide sequences target others, we could design the guides and the rest of these sequences such that adding them to a solution in different orders produces different final outputs. For example, suppose the following set of interactions:
- A inactivates B
- B inactivates C
- C inactivates A
- D inactivates B
Suppose that the output of the function is chosen to be “Which of A, B, and/or C are intact following the sequence of additions?”
For the addition order D, B, C, A, the output is C.
Because D is present, when B is added, B gets inactivated. Since B is inactive, it doesn’t affect C when the latter is added. C inactivates A, so when A is added, it gets inactivated–leaving only C intact among the sequences we’ve chosen as our outputs.
For the addition order C, B, A, D, the output is A: B inactivates C, leaving A free to inactivate B, and D then does nothing, since B is already inactive.
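Both orderings are easy to sanity-check in silico. Below is a minimal sketch of the toy cascade–the dictionary representation is my own framing, not a proposed implementation. The rule: a newly added sequence is inactivated on arrival if some already-active sequence targets it; otherwise it inactivates its own target.

```python
# Toy model of the ordering function. ATTACKS maps each sequence to the
# sequence its guide targets; a newcomer is inactivated on arrival if an
# already-active sequence targets it, and otherwise disables its own target.
ATTACKS = {"A": "B", "B": "C", "C": "A", "D": "B"}

def run(order: list[str]) -> set[str]:
    active = set()
    for seq in order:
        if any(ATTACKS.get(other) == seq for other in active):
            continue                      # inactivated the moment it's added
        active.discard(ATTACKS.get(seq))  # no-op if the target isn't active yet
        active.add(seq)
    return active & {"A", "B", "C"}       # the designated output sequences

print(run(["D", "B", "C", "A"]))  # {'C'}
print(run(["C", "B", "A", "D"]))  # {'A'}
```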
Generalizing this, even with simple one-to-one interactions there are N! possible orderings for N elements. The “order of operations”, so to speak, means that the specific ordering which yields the desired output can serve as a cryptographic key.
However, one could flip this around so that the order itself is the output of the function. How would this work? Well, suppose you have some collection of sequences from which one or more are missing. You’re given the first and last sequences in the order, and want to compute the rest. If you knew all the sequences, there’s a P-time graph algorithm (breadth-first search, essentially tracing a path through the graph) to get you from the first entry in the list to the last–at least so long as the ordering needn’t use every sequence; demanding that it use all of them turns this into the Hamiltonian path problem, which is not known to be solvable in P-time. Absent one or more of the DNA sequences, however, you’re back to as many as N! possibilities. It’d likely be feasible to narrow down the space, but, properly designed, the number of potential orderings would remain enormous. The asymmetry between finding the (or an) order given all the sequences vs finding it given only some of them might make for a viable cryptographic function.
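Here’s a sketch of that asymmetry under the simple-path reading: with the full interaction graph in hand, breadth-first search recovers a first-to-last ordering quickly, while an adversary missing the graph can do little better than query orderings one at a time. The graph, the secret, and the oracle are all invented for illustration.

```python
from collections import deque
from itertools import permutations

# Hypothetical interaction graph: an edge u -> v means "u may directly precede v".
GRAPH = {
    "A": ["B", "E"], "B": ["C"], "C": ["D"],
    "E": ["F"], "F": ["D"], "D": [],
}

def find_order(graph, first, last):
    """Full knowledge: breadth-first search finds a first-to-last ordering in P-time."""
    queue = deque([[first]])
    while queue:
        path = queue.popleft()
        if path[-1] == last:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in path:
                queue.append(path + [nxt])
    return None

def brute_force(known_middles, first, last, oracle):
    """Missing the graph: little to do but try orderings against an oracle."""
    for middle in permutations(known_middles):
        candidate = [first, *middle, last]
        if oracle(candidate):
            return candidate
    return None

print(find_order(GRAPH, "A", "D"))                 # ['A', 'B', 'C', 'D']
secret = ["A", "E", "F", "D"]
oracle = lambda cand: cand == secret               # stand-in for a wet-lab check
print(brute_force(["E", "F"], "A", "D", oracle))   # N! blowup as N grows
```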
But suppose we weren’t content with that. Well, transcription factors can alter the expression probability of a given section of a DNA sequence, and their inclusion takes us from the realm of graphs7 to that of hypergraphs8–a single factor can act on several targets at once, so one “link” can join many nodes. Rather than the more binary operations of the preceding example, transcription factors primarily alter the probability of a given bit of DNA sequence being active, meaning that we have another lever (or some large quantity of them) for encoding distributional information.9
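Here’s a sketch of that probabilistic layer, with entirely made-up numbers: each transcription factor multiplicatively scales the expression probability of one or more targets, and a factor touching several targets at once is exactly the hyperedge.

```python
import random

# Made-up numbers: baseline probability that each output sequence is
# expressed, plus multiplicative effects of each transcription factor.
BASELINE = {"A": 0.9, "B": 0.9, "C": 0.9}
TF_EFFECTS = {
    "TF1": {"B": 0.2},             # a simple weighted edge
    "TF2": {"A": 0.5, "C": 1.5},   # one factor, two targets: a hyperedge
}

def expression_probs(tfs_present: set[str]) -> dict[str, float]:
    probs = dict(BASELINE)
    for tf in tfs_present:
        for target, factor in TF_EFFECTS.get(tf, {}).items():
            probs[target] = min(1.0, probs[target] * factor)
    return probs

def sample_readout(tfs_present: set[str]) -> set[str]:
    """One stochastic readout: each sequence is active with its current probability."""
    return {s for s, p in expression_probs(tfs_present).items() if random.random() < p}

print(expression_probs({"TF1", "TF2"}))  # ~{'A': 0.45, 'B': 0.18, 'C': 1.0}
print(sample_readout({"TF1"}))           # varies run to run
```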
Now imagine we take our engineered transcription factors and design small molecule inhibitors for one or more of them. This introduces another form of key beyond the ordering keys discussed above. The space of small molecule structures and their interactions with transcription factors puts N! to shame, because the space of possible geometries is continuous. That doesn’t mean there’s an infinite number of usable solutions in practice, but it makes the space of possibilities delightfully vast.
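Continuing in the same toy vein: an inhibitor masks its target factor outright, so the set of inhibitors supplied acts as a second key, independent of the ordering key. Molecule names and effects are, again, invented.

```python
# Made-up molecules: each inhibitor masks one transcription factor outright,
# so which inhibitors are supplied forms a second key, independent of ordering.
TF_EFFECTS = {"TF1": {"B": 0.2}, "TF2": {"A": 0.5, "C": 1.5}}
INHIBITS = {"mol_x": "TF1", "mol_y": "TF2"}

def effective_effects(tfs: set[str], inhibitors: set[str]) -> dict[str, dict[str, float]]:
    blocked = {INHIBITS[m] for m in inhibitors if m in INHIBITS}
    return {tf: TF_EFFECTS[tf] for tf in tfs - blocked}

print(effective_effects({"TF1", "TF2"}, {"mol_x"}))  # only TF2's effects survive
```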
So we have multiple types of functions, multiple types of data, and multiple forms of keys with which to encrypt them–all via repurposing things evolution has already engineered DNA and proteins to be able to do.
Potentially a useful tool for “Bond. Noncovalent Bond.”10
As always, eager to hear your thoughts. If you’re interested in this topic, feel free to reach out via the contact page on this site.
Footnotes:
- For those unfamiliar with computational complexity, P-time means that an algorithm will at worst run in an amount of time which scales with some polynomial function of the problem size. For example, an algorithm which sorts a list of N numbers by swapping them pairwise if they’re out of order will, at worst, take ~N^2 operations. This is inefficient compared with the N*log(N) algorithms which exist, but very fast compared with the decidedly not P-time approach of exhaustively trying all N! orderings. ↩︎
- https://blog.computationalcomplexity.org/2005/03/pnp-and-arts.html ↩︎
- https://www.aria.org.uk/opportunity-spaces/trust-everything-everywhere ↩︎
- Anam et al, 2010 (https://spj.science.org/doi/full/10.34133/icomputing.0106#sec-4); Chu et al, 2025 (https://spj.science.org/doi/full/10.34133/icomputing.0106#sec-4); Niu et al, 2019 (https://link.springer.com/chapter/10.1007/978-981-15-3415-7_11); Mondal and Ray, 2023 (https://arxiv.org/pdf/1904.05528) ↩︎
- https://en.wikipedia.org/wiki/Steganography ↩︎
- https://news.stanford.edu/stories/2024/06/stanford-explainer-crispr-gene-editing-and-beyond ↩︎
- “Graphs” in the mathematical sense, i.e. networks where any given link is between two nodes. ↩︎
- Instead of links being between two nodes, a single link can simultaneously be between up to K nodes for a K-graph. This doesn’t lend itself to pictures as well, but if you think of a network as defined by pairs of nodes joined by a link, e.g. some collection of pairs (i,j), a K-graph’s links are just sets (i,j,…,m) such that there are K numbers inside the parentheses. ↩︎
- By nature of how molecular biology works, even the direct inactivation route is still probabilistic, but because it isn’t invertible over time it can get close to deterministic. By contrast, transcription factors can associate, dissociate, reassociate, and so on with the same sequence, meaning at any given time you just have a certain probability of a given copy of a sequence having its expression altered. ↩︎
- I’m sorry, I couldn’t help myself. ↩︎