Laptops over lab coats: Can AI really help a team of 12 disrupt the pharmaceutical industry?
Matt A.V. Chaban
Senior Editor, Transform
Dan Rowinski
Senior Writer, Transform
Meet Superluminal Medicine, part of the wave of start-ups using gen AI and high-performance computing to hunt for new protein compounds for drugs.
Drug discovery is an extraordinarily complicated field. For all of humanity’s accumulated knowledge and best intentions, we can still be surprised at the effects when a specific set of compounds reacts in the body.
For instance, a certain blue pill popular with men started as a treatment for hypertension, and a new and sought-after generation of weight loss drugs started as a treatment for Type 2 diabetes. Drug discovery involves so many variables that it is a field begging for modern technology to help build new business models, make processes faster, and predict possible outcomes.
This complexity, and the opportunities that come with it, help explain why drug discovery, as much as any other application, has so excited proponents of generative AI. The field has been set abuzz by the possibilities to not only speed up research and testing of life-changing medicines but also to explore novel compounds and combinations that scientists might never have discovered on their own.
Massachusetts-based Superluminal Medicines is among those pioneering this medical frontier, combining deep subject matter expertise with cloud computing and artificial intelligence. Superluminal means “having a speed greater than that of light,” and the idea of supercharging drug discovery is the company’s guiding mission. With a small team and a robust seed round of financing, Superluminal is ready to take on the drug discovery challenge.
“We look at particular problems where technology can give biology an assist,” Cony D’Cruz, CEO of Superluminal, said in a series of interviews with Transform. “Our job is to develop drugs really fast using whatever technology is fit for purpose.”
While Superluminal’s platform is applicable to any protein targets, the company decided to focus its first set of discovery programs on drug targets known as G protein-coupled receptors (GPCRs). This is the same class of targets addressed by popular weight loss drugs like Ozempic and Wegovy.
Superluminal has only a dozen employees, working out of a small office — one that looks like any other start-up and is devoid of lab space — in the Boston suburb of Waltham. Yet thanks to cloud computing, the team can do pretty much all its work from anywhere. This includes its primary focus: tapping into multiple machine learning and generative AI methods to understand a cell’s protein receptors, model potential compounds, and determine potential side effects.
About one-third of all approved drugs target 130 different GPCRs. But there are more than 850 GPCRs in the body, so Superluminal believes that with the speed the cloud and AI provide, it has a significant runway in developing models that target these important proteins with small-molecule drug discovery.
Transform sat down with D’Cruz to talk about the company’s mission and methods, how it uses the cloud and artificial intelligence as a small life-sciences company, and the unique opportunity its combination of technology and industry expertise presents for drug discovery.
Modeling proteins and drug interactions
What is unique or new about Superluminal’s approach to drug discovery? What are you doing that no one else has, and what role is cloud and AI playing in making that possible?
Cony D’Cruz: I like to think of it as the difference between looking at a picture and watching a movie in 3-D. Proteins in the body are continuously in motion, and that’s the way we want to conceive of them and study them. However, when most people try to drug a particular protein, they tend to look at snapshots, and they treat their solution like a lock-and-key, testing each “key” they’ve created in the lab to see if it will fit or not. That's not very productive.
Because proteins are continuously moving, you need to be able to understand that full motion, which requires a lot of compute power; working with Google Cloud makes that possible for us. We have the ability not only to look at individual structures but to stitch them together to get a full-motion view. So we’re looking at these movies rather than stills. That's really important because it gives you a more accurate picture of how the protein is behaving in a cell and the body, and therefore how you can interfere with it appropriately.
Once you have your protein “movies,” what are you doing with them?
We use that knowledge to screen really vast libraries of chemical compounds, trawling through all the predicted molecules that you could possibly make today. That space is something like 7.8 trillion compounds. If you tried to do that in a lab, it would take a long time — something like four years to work through 40 billion compounds. We do that in a day. It's a huge step up in scale.
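The scale of that claim is easy to sanity-check. A minimal back-of-envelope sketch, using only the numbers quoted above (the exact throughput figures are the interview's, not a published benchmark):

```python
# Back-of-envelope check of the screening numbers quoted above:
# a lab working through ~40 billion compounds in ~4 years, versus
# the same library screened computationally in a day.

LAB_COMPOUNDS = 40_000_000_000   # ~40 billion compounds
LAB_YEARS = 4                    # quoted lab timescale
CLOUD_DAYS = 1                   # quoted cloud timescale

lab_days = LAB_YEARS * 365
speedup = lab_days / CLOUD_DAYS
per_day_cloud = LAB_COMPOUNDS / CLOUD_DAYS

print(f"Speedup: ~{speedup:,.0f}x")                           # ~1,460x
print(f"Cloud throughput: ~{per_day_cloud:,.0f} compounds/day")
```

Even at that pace, covering the full 7.8-trillion-compound space would take roughly 195 such days, which is why the models are used to prioritize promising regions of chemical space rather than exhaustively enumerate it.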
We also use generative AI throughout the process. We have the ability, once we identify an interesting hit, to use machine learning and predictive algorithms to generate novel chemistry that no human has probably seen. The machine takes what we've given it in terms of the model and develops its own novel chemical based on those inputs, right? It takes us into a whole new chemical space, very, very quickly.
Fewer side effects from AI
How are you training your AI models to be able to understand the potential of these compounds, and where does this fall in the more typical drug discovery process?
How a drug interacts in the body is typically defined by ADME — it’s pronounced “add me,” and it stands for “absorption, distribution, metabolism, and excretion.” Essentially, ADME is the process by which a drug chemical enters the bloodstream, gets distributed around the body to its target protein, is broken down and metabolized, and is then excreted. We need to understand each of these interactions, because if any one of these things doesn't happen, the drug is not going to have the desired therapeutic effect.
However, most drug programs fail because of unintended ADME consequences rather than a lack of the drug compound’s intended effect. For example, the drug does not get absorbed into the bloodstream, or it’s metabolized too quickly, or the reverse — the compound builds up in the bloodstream and causes toxicity.
What we’ve done is essentially build a predictive ADME engine — we call it ADME-Drive — trained on large proprietary data sets. It’s a collection of models that enables you to determine how a drug is going to work in a human body, without it ever having entered the body, and gives you a line of sight to possible ADME liabilities and issues before a compound ever reaches the clinic.
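The core logic D’Cruz describes — a compound fails if any one ADME property misses the mark — amounts to a gate over predicted properties. A minimal sketch of that idea in Python; the property names and thresholds here are illustrative assumptions, not Superluminal’s actual ADME-Drive criteria:

```python
# Hedged sketch of the ADME gating idea: a candidate compound only
# advances if EVERY predicted property clears its threshold.
# Thresholds below are illustrative, not real ADME-Drive values.

ADME_THRESHOLDS = {
    "absorption":   0.5,   # enough of the dose reaches the bloodstream
    "distribution": 0.5,   # enough reaches the target tissue
    "metabolism":   0.3,   # not broken down too quickly...
    "excretion":    0.3,   # ...but still cleared, so it doesn't accumulate
}

def passes_adme(predicted: dict) -> bool:
    """Fail the compound if ANY predicted property misses its cutoff."""
    return all(
        predicted.get(prop, 0.0) >= cutoff
        for prop, cutoff in ADME_THRESHOLDS.items()
    )

candidate = {"absorption": 0.8, "distribution": 0.6,
             "metabolism": 0.4, "excretion": 0.5}
print(passes_adme(candidate))  # True: all four properties clear
```

The all-or-nothing structure mirrors the interview’s point: a single ADME liability sinks a program, no matter how potent the compound is against its target.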
Are there any other benefits to using AI models in this way?
There is one, and it has nothing to do with the drugs themselves. Roughly 94% of drugs that pass animal tests fail in human clinical trials. That’s a lot of excess testing on animals to gather data, with no better options — until now. Researchers have cured a lot of mice and rats of Alzheimer's, but we haven't done the same for humans. That's because those models aren't transferable. The predictive models that I've been talking about are human-based.
We’ll still need to do testing to get a better sense of efficacy, but we’ll need to do a lot less of it because we’ll be closer to a solution from the start. So we're using far fewer animals and getting the same data. Generative AI gives us the ability to train these models using our large proprietary data sets, and then to use the models to confirm those interactions before we have to move on to clinical testing.
Discovering new drugs with help from AI and the cloud
You’re using AI throughout multiple aspects of your research methods. First to help model the “movie” that is the protein target, then also in the ADME process. What about in the modeling of new compounds, is AI playing a role there?
The first use case that attracted our attention was what DeepMind kicked off with AlphaFold2. We expanded on AlphaFold2 to go from sequence to structure to multiple conformations, or shapes. These are publicly available foundation models that we've enhanced based on our expertise and the applications we need.
The second use of AI is what we call de novo design. That means allowing the machine to actually design compounds that are fit for purpose. We feed it very simple terms as the starting blocks, and then we let the artificial intelligence create new, never-before-seen molecules that retain the initial function of the molecule we programmed into it at the start. That generates novel chemical matter for us, which we're actually taking forward after rigorous testing.
Now we’ve talked a lot about research, but one of the most surprising things about your work may be how you actually do it — it’s not thousands of scientists in lab coats, it’s about a dozen of you at laptops, right?
We've been able to move really quickly because of the back-end infrastructure a cloud provider like Google Cloud offers. We can focus on driving the research and application side and not worry about the back end. It’s like when you get into your car, you don't think, oh, I've got to make sure the spark plugs are working. It just happens, right? You get in your car and you go where you want to go.
That's what we want. We wouldn't be able to do what we do without cloud enablement. As a small company, to be able to scale the way we do, it’s only possible because of cloud compute and the prebuilt servers and models that get our research so much closer to production.
I’ve seen small companies that had on-prem supercomputers or high-performance computers of their own; they raised a lot of money and then blew through it just setting up their IT. And it’s not just that they were trying to support what they were doing on-prem — they also didn't have any flexibility, so they couldn't scale up or scale down.
When you dedicate that kind of spending to an on-prem high-performance computer, obviously it's going to eat up your resources. What we do is burst to the cloud when we need to. We don't have many internal administration costs. Our ability to scale up or reduce as required is totally in our control. We have access to virtually unlimited compute resources so that we have a rapid turnaround time for our computer-based experiments.
What does that look like in the usual course of research?
We have certain applications that we need to run — for example, molecular simulations for understanding how a protein binds to its drug target. We need the Google Cloud back end for those simulations to happen.
So we simulate, on the computer, how a protein binds to a chemical compound. It takes around 20 seconds for each compound. Once we’ve collected data on enough compounds, we can train machine learning models. Our models then work with combinations of multiple protein conformations and billions of compounds. Ultimately, this allows us to perform our calculations in a matter of days using the cloud. Using more traditional methods, this work can take years.
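Those figures imply massive parallelism. A rough sketch of the math, using the ~20-seconds-per-compound number quoted above; the compound count and turnaround target are illustrative assumptions, and the real cluster sizing is not disclosed in the interview:

```python
# Rough parallelism math for the numbers quoted above: ~20 seconds of
# simulation per compound, "billions of compounds," answers in days.
# COMPOUNDS and TARGET_DAYS are illustrative, not Superluminal's figures.

SECONDS_PER_COMPOUND = 20
COMPOUNDS = 1_000_000_000        # "billions" (taking the lower bound)
TARGET_DAYS = 3                  # "a matter of days"

total_cpu_seconds = SECONDS_PER_COMPOUND * COMPOUNDS
budget_seconds = TARGET_DAYS * 24 * 3600
workers_needed = total_cpu_seconds / budget_seconds

print(f"Serial time: ~{total_cpu_seconds / (365 * 24 * 3600):,.0f} years")
print(f"Concurrent workers for {TARGET_DAYS} days: ~{workers_needed:,.0f}")
```

Run serially, a billion 20-second simulations would take centuries; finishing in days requires tens of thousands of concurrent workers, which is exactly the kind of burst capacity D’Cruz describes renting from the cloud rather than owning.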
Beyond the computing power of using Google Cloud, there is the fact that the cloud is totally scalable. So when we find something that we need to do, we can scale dramatically very, very quickly and then bring it down appropriately. Our time and expenditures are directly tied to when we need to test something, with no IT overhead beyond that.
If we didn't have cloud compute, it would be really challenging to do what we do, especially for a team of our size — it’s just 12 of us — working as fast as we are.
Opening image created with Midjourney, running on Google Cloud, using the prompt: a collection of cellular proteins illustrated in a flat, colorful style for the cover of a smart business magazine.