More well thought out work can be found at — https://axial.substack.com/
Axial partners with great founders and inventors. We invest in early-stage life sciences companies, often when they are no more than an idea. We are fanatical about helping the rare inventor who is compelled to build their own enduring business. If you or someone you know has a great idea or company in life sciences, Axial would be excited to get to know you and possibly invest in your vision and company. We are excited to be in business with you - email us at info@axialvc.com
Observations #50
A set of ideas and observations from a week’s worth of work analyzing businesses and technologies.
Talent and models: AI-driven life sciences companies
An important element of building an AI-focused life sciences company is recruiting and training engineering talent. Recruiting AI talent is a lot harder than you would expect, and retaining it is even harder. In general, software engineers are being paid like professional athletes, especially those with specialized knowledge in artificial intelligence.
Talent may be the main bottleneck limiting the impact of AI on life sciences. There are hundreds of thousands of AI engineers in the world, but the competition for them spans almost every industry. For a life sciences company, there are 3 main challenges for AI talent: (1) getting the expertise in the door, (2) helping engineers learn enough biology to be dangerous, and (3) retaining the trained engineering talent given the fierce competition from the Googles and Facebooks of the world along with every quant hedge fund. Given this, what are some strategies to build talented engineering teams?
Get them early - a co-founder or early employee has to be a world-class AI engineer to improve the odds of attracting more. It’s pretty hard, but not impossible, for a founding team of biologists to recruit great engineers.
Get an advisor - the second-best option to attract talent early is to get an advisor who has expertise in AI. In short, there is a need for more Jeff Deans of biology.
Poach entire teams - engineers in large technology companies might be open to moving into life sciences depending on their salary/equity packages along with the potential to work with co-workers. Hexagon Bio did the best job here recruiting a lot of great engineers from Palantir.
Train biologists to become data scientists and SWEs - the last resort is to train biologists to become engineers. This might take 3-5 years though so be patient.
The last challenge for an AI-focused life sciences company is implementing biologically relevant models. Data generation and talent are prerequisites for AI models, but understanding when to deploy them is another important challenge. Biology is still the limiting factor, and its complexity is what makes it beautiful but hard to engineer. Given this bottleneck, having a deep understanding of deployment phases and other features before translation is incredibly important. So what are some of the main problems for implementing models in biology?
Unstructured and noisy data - collecting data from sequencing to spatial transcriptomics to even health records can create too much complexity for a given model and team to handle. Just as important as any given data source are cross-validation studies.
Training data coverage - making sure the initial data set and training period are robust enough to deploy into the lab. Depending on the problem addressed, this process can take weeks or even months.
When to deploy? - figuring out when a model is accurate enough to trust its recommendations. For example, two key tools for evaluating classification models are the ROC curve and the AUC, which capture the performance of the model at each classification threshold and the aggregate performance across all thresholds, respectively. In diagnostics, determining when an AUC is high enough to distinguish patients with a disease from those without, and therefore to deploy, is tricky without validation data.
Time to deploy? - decreasing the time it takes from developing a model to deploying one. It’s not obvious this gets easier over time given the increasing number of edge cases with a larger data set. However, this is where in-house data generation capabilities become an advantage to manage the cost of wet lab work and hopefully make deployment less expensive.
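To make the ROC and AUC point concrete, here is a minimal sketch in plain Python. The labels and scores are hypothetical, made up purely for illustration, and the function names (`roc_curve`, `auc`) are our own rather than any particular library's API. It sweeps every score as a threshold to build the ROC curve, then takes the area under it with the trapezoidal rule.

```python
from typing import List, Tuple


def roc_curve(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    # Sort by descending score; each distinct score acts as a threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample sharing this score (handles ties).
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: 1 = disease, 0 = no disease; scores are model outputs.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
points = roc_curve(labels, scores)
print(points)
print(auc(points))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation. Where the bar for deployment sits between those two depends on the application, which is why held-out validation data matters.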
During the process of model design, early positive signals can give a false sense of confidence. Initial hits may not translate well at a certain stage of development. An early discovery may not account for a unique edge case. Put simply, a mechanism of action (MoA) that works in models may not translate into the clinic. Or for something in biomanufacturing, a model that is accurate at one scale may not work at another due to differences in oxygen circulation, among other external factors. The long tail is where the majority of the work is done to implement AI models. Biology is a large search space and can easily multiply the number of edge cases, thereby increasing costs of development. AI models that take advantage of parallelization to test models, manage data inputs, and work to eliminate steps in the product development process have a shot at reducing this search space. Ultimately, these edge cases may never disappear given the complexity of biology and will need to be validated at the bench or in the clinic.
The rules of building an AI-first life sciences business are still being created and written by companies like Insitro, Recursion Pharmaceuticals, and more. Three important moats are talent, scale, and product development - can a company recruit/retain the best AI talent, validate their models more efficiently, and ask the most important questions? But AI and data themselves likely do not create moats in the long run. Both become commodities - AI models will be commoditized, especially with pre-trained versions and various open source libraries, and data can be put in the public domain, while proprietary datasets can soon become commoditized as their generation becomes cheaper (e.g. sequencing that costs $10M today may cost only $1M in a few years). There are 5 key lessons in building AI-first life sciences companies:
Focus on specific problems
Narrowing focus to a specific task in life sciences is pretty important given how complex the whole field is. Ideally, low-hanging-fruit tasks are pursued first to validate the models and build momentum on the product development side. Examples are BigHat and LabGenius working on antibodies, and even AbCellera has built out its AI team. Unnatural Products and Anagenex for certain types of small molecules. Serotiny for CARs. Dyno Therapeutics for AAVs. Asimov for certain mammalian cells. All of these companies focus on specific problems and use AI to scale low-complexity tasks.
This enables reducing complexity of models and data
By narrowing down focus, a company can reduce the complexity of its models and data generation capabilities. Dyno can get really good at capsid engineering rather than having to build models for everything ranging from transgene expression to engulfment. Serotiny can build product moats around chimeric antigen receptor constructs without having to necessarily create models for target engagement. This reduction in complexity has a direct impact on COGS.
Recognize the high variable costs of building a life sciences business centered around AI
An AI-focused life sciences company not only has costs associated with wet lab work but costs from model implementation and compute. This is what makes these business models unique: they make an upfront investment in software and AI with the premise that more products or partnerships will come. More often than not platforms are overbuilt to prepare for new applications. Hopefully these variable costs decrease as more infrastructure is put into place. However, AI will have an impact on gross margins initially driven by model maintenance and deployment; it might be useful for companies to split these variable costs from wet lab work.
Commoditize or be commoditized
New AI tools are still being built out. We are still pretty early in this wave of AI progress, with the deep learning breakthrough on ImageNet coming in 2012 and TensorFlow and AlphaGo in 2015. Beyond fundamental breakthroughs, new workflows for AI models are emerging along with new models and automation of training tasks. Given all of this activity, an AI-focused life sciences company can easily fall behind on technology. It’s important to be forward thinking and constantly update your toolkit.
Use AI to build moats around products
The model itself is not a moat nor is the data. AI often allows problems in biology to be solved in a new way. Examples are old problems like drug repurposing or AAV and CAR design that needed scale to unlock new therapeutic variants. AI can help design unique products where a moat can be constructed.
Ultimately, AI companies in life sciences are more bio than software. Some of the advantages of AI don’t come cheaply. Companies face challenges in data generation, building interdisciplinary teams, and variable costs of developing AI models. These upfront costs create barriers to new entrants and provide some defensibility. Hybrid business models that merge life sciences and AI have the potential to invent faster, create new markets, and make product development more efficient. AI may lead to non-obvious side effects; for example, in drug development, more data might make it easier to find more ways to argue for an approval. There are opportunities to bring AI to metabolic engineering, biomanufacturing, enzymes, and ASOs. Beyond drug development, companies like Ginkgo and Zymergen have taken the lead on using AI in synthetic biology. Inflammatix and Endpoint Health in diagnostics. Levels and Whoop in consumer health products. Abridge and PatientPing in healthcare. Artificial intelligence has the potential to create a template for new products in all of life sciences. But at the very least, AI allows a business to generate new IP and execute narrower partnerships. For any AI-focused life sciences company, talent may be the most important moat; however, hypothesis generation and scale are valuable as well. The enduring businesses will focus on problems that are easy for AI models, avoid edge cases, and combine their platform with other business strategies.
Rare diseases in China and India
As the healthcare systems of other countries, mainly China and India, mature and their drug development ecosystems grow, the impact of this growth will reverberate across all rare diseases. The combined population of China and India is over 8x that of the US:
US population: ~328M
China’s population: ~1.4B
India’s population: ~1.36B
That’s nearly an order of magnitude more rare disease patients, which can make trial recruitment a lot faster and advocacy more effective for more patient communities. The growth of India and China makes rare diseases a little less rare. Opportunities in global clinical trial recruitment will emerge along with patient-driven drug development and consumer genetics in general.
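The arithmetic behind the population comparison can be checked with a short sketch. The population figures are the approximate ones quoted above. The prevalence of 1 in 100,000 is a hypothetical number chosen only for illustration, and the sketch assumes the same prevalence in every country, which will not hold for every disease.

```python
# Approximate populations quoted in the text.
US = 328_000_000
CHINA = 1_400_000_000
INDIA = 1_360_000_000

# Hypothetical prevalence: 1 patient per 100,000 people (illustration only).
PREVALENCE_DENOMINATOR = 100_000


def expected_patients(population: int) -> int:
    """Expected patient count, assuming uniform prevalence across populations."""
    return population // PREVALENCE_DENOMINATOR


ratio = (CHINA + INDIA) / US
print(f"China + India vs US: {ratio:.1f}x")
print("US patients:", expected_patients(US))
print("China + India patients:", expected_patients(CHINA + INDIA))
```

Under these assumptions, a disease with roughly 3,000 patients in the US would have tens of thousands across China and India, which is the scale that changes trial recruitment.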