When OpenAI's GPT-3 natural language processing software burst on the scene in 2020, one of the most remarkable things about it was its ability to carry out a variety of tasks in a "zero-shot" fashion. Zero-shot means performing a task without having been given any explicit examples of it, such as printing the French word "rate" when a person types the phrase "translate the word spleen into French," despite never having been trained explicitly to translate.
In the near future, AI programs may be able to develop new cancer drugs in a zero-shot fashion, inventing combinations of amino acids that bind to cancer cells and neutralize them with no prior example of an effective protein.
This past month, drug development firm Absci unveiled a paper containing novel antibodies against what's called "human epidermal growth factor receptor 2," or HER2, a human oncogene that has been linked to some forms of breast cancer.
The AI model had been fed no data on existing, successful antibodies against HER2, and no explicit information about how to successfully attach to -- or "bind" to -- HER2.
"We were able to design the CDR regions of the antibodies -- those regions that are actually involved in binding to the antigen's epitope -- and we were able to create those, even though we had removed all the data from the training set that was related to the target," said Joshua Meier, Absci's chief AI officer, in an interview with ZDNET.
The antibodies, moreover, were designed de novo, meaning the AI program built them from first principles, without being fed any examples of antibodies known to bind the target.
"The whole point is it's de novo," said Meier. "You've never seen the antibody before, you're making this from scratch."
The paper also marks the first time, according to its authors, that antibodies designed de novo by AI have been tested for binding affinity in a wet lab. Absci, founded over a decade ago, has a novel method for engineering E. coli cells as factories for producing custom proteins. That ability lets the company mass-produce the AI-designed antibodies and test them in the wet lab.
As the paper points out, prior work has been carried out entirely on computers, or in silico. "Several groups have introduced models for generative antibody design with promising in silico evidence, however, no such method has demonstrated de novo antibody design with experimental validation."
Absci, a publicly listed corporation, has formed partnerships with multiple pharmaceutical firms, including Merck, to pursue novel therapies for a variety of indications.
"We're really excited because we will have a drug in the clinic in 2024 that will utilize this technology," said Absci founder and CEO Sean McClain, in the same interview with Meier. "It'll be the first de novo-designed antibody using zero shot generative AI, and I think that that's going to be a huge moment in the industry."
The paper, "Unlocking de novo antibody design with generative artificial intelligence," authored by Amir Shanehsazzadeh and colleagues at Absci, with Meier as corresponding author, was posted Jan. 9 on the bioRxiv pre-print server. The paper has not been peer-reviewed, and so its findings should be taken with a degree of caution.
Antibodies are proteins that interfere with, and ideally neutralize, an invading pathogen by attaching to a molecule on it known as the antigen. Aspects of the antibody's shape match aspects of the shape of the antigen, so the search for antibodies as cures for disease involves finding what is often likened to a lock-and-key combination.
The CDRs, the complementarity-determining regions, are the parts of the antibody that serve as the key. The epitope, like a lock, is the place on the antigen, the target, where the antibody binds.
A striking property of the work, said Meier, is that the AI program arrived at novel chains of amino acids that are unlike typical antibodies to HER2.
"The model comes up with sequences that are totally different from ones that we see in nature," said Meier.
For example, what's called the CDR3 region, the portion of the antibody that does most of the binding, contains 13 amino acids. "The model was able to change 12 of those 13 amino acids and still have binding to the target," explained Meier. "It came up with a number of solutions like that."
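To make the "12 of 13" figure concrete, here is a minimal Python sketch of how one might count the positions at which a designed CDR3 differs from a reference CDR3 of the same length. The function name and both sequences are invented placeholders for illustration; they are not taken from Absci's paper or its released data.

```python
def count_substitutions(reference: str, design: str) -> int:
    """Count positions where two equal-length sequences differ (Hamming distance)."""
    if len(reference) != len(design):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(reference, design))


# Placeholder 13-residue sequences, invented purely for illustration.
reference_cdr3 = "ARDGYSSGWYFDV"
designed_cdr3 = "AKEPFTTNHLMEI"

changed = count_substitutions(reference_cdr3, designed_cdr3)
identity = 1 - changed / len(reference_cdr3)
print(f"{changed} of {len(reference_cdr3)} positions changed "
      f"({identity:.0%} sequence identity)")
```

A design that differs at 12 of 13 positions shares only about 8% of its residues with the reference, which is why binding to the same target is notable.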
The discovery is analogous to how AlphaZero, from Google's DeepMind unit, was able to arrive at winning strategies in chess and Go that differ from how human grandmasters play those games.
As Meier explains, "The model has figured out that there are certain interactions that are key to binding to the same epitope -- it seems the model has learned spatially to place these amino acids in the same place, even though those amino acids might be in different places in the sequence.
"Not only did [the model] change the kind of amino acids, but it made sure to point it in a way that it would still make the same interaction," said Meier. "It was just incredibly interesting that the model was able to figure this out, even though it has never seen in the training data anything interacting with that target before."
The novelty of the program means "it's additionally being able to explore a much larger search space, which is really, really exciting," said McClain.
In a paper last fall, Absci's AI scientists came up with a novel metric for designed antibodies that they dubbed "naturalness." Naturalness expresses as a numeric score how close a synthesized antibody is to the body's naturally occurring antibodies (known as immunoglobulins). Higher naturalness should improve the antibody's ability to function in practice in the body.
The new study, said Meier, devised antibodies that not only demonstrated high naturalness scores but in some cases scored as more natural than Trastuzumab, a clinically approved therapeutic for breast cancer.
As an AI paper, the text might raise eyebrows because it does not disclose which AI model was used to design the antibodies. The paper describes only the resulting antibodies and the wet lab procedure by which they were tested for binding affinity.
That omission was deliberate, McClain and Meier told ZDNET.
"That's the secret sauce," said Meier of the AI model, meaning that it is protected intellectual property of Absci.
That has caused speculation, noted Meier and McClain. "You see people who are tweeting about our work saying, Oh, this is like a language model," said Meier, referring to what are called large language model AI programs, such as GPT-3. "But we never wrote anywhere in the paper that there's a language model here," he said.
"I'm not going to say anything," continued Meier. "Maybe there's a language model-like part of this thing, but it's definitely not just a language model; it's not an off-the-shelf thing that you just train."
Meier and McClain indicated that the AI program is a generative AI program, which puts the program in the same grouping as GPT-3 and also OpenAI's follow-on ChatGPT program. But that also leaves wide room for any number of different kinds of programs.
Added Meier, "There [are] a bunch of things going on here, some of which are things that are out there, but that were synthesized in a novel way."
It's "kind of funny," said McClain, to see how various outsiders have formed conclusions about the AI model or models. "They're very definitive, 'Yeah, this is a language model,' and it's just funny because it's actually pretty novel breakthroughs," he said.
Meier offered to "give some hints" as to the nature of the program. It's definitely "generative" AI, he said. "It should be clear that you have to make something generative" in order to take in an antigen and produce sequences as an output.
In addition, "You probably need one of these more end-to-end kinds of systems," said Meier, meaning a program where the final output goal shapes every aspect of the various functions that combine in the AI model to solve the problem.
The bottom line for Meier is that if it were as easy as taking GPT-3 or another existing language model and applying it, "you might have seen other people already put out data here, and nobody has really put out data on this problem."
The bottom line for McClain is that "pharma doesn't care how you got the outcome."
"What they care about is the output, and the output is, the sequences, the diversity you get, and, ultimately, does the drug work or does it not?"
Although the machine learning approach is not disclosed, McClain emphasizes that the work does give sequence data for all the antibodies for reproducibility.
"We actually released all 400-plus antibodies that were designed to HER2, and disclosed the sequences, as well as the binding affinities, to really show the diversity that we got. There's like a reproducibility crisis in biology," observes Meier. "You put out a big claim like this, and people might be skeptical."
"If you don't believe us, go test it in the lab yourself!" added McClain. The sequence data are posted on the article's companion GitHub site.
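A reader who wants to check the diversity claim could start with something like the following minimal Python sketch, which computes the average edit distance between every pair of sequences in a set. The helper names and the toy sequences are hypothetical; the actual file layout of the released data is not described here, so loading it is left out.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def mean_pairwise_distance(sequences: list[str]) -> float:
    """Average edit distance over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(edit_distance(a, b) for a, b in pairs) / len(pairs)


# Toy sequences invented for illustration; not from the released data set.
toy_designs = ["ARDGYW", "ARDGFW", "AKEPYWV"]
print(f"mean pairwise edit distance: {mean_pairwise_distance(toy_designs):.2f}")
```

Edit distance is used here rather than a position-by-position comparison because it still works when sequences differ in length. A larger average suggests a more varied set of designs.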
Absci is pursuing a dual business track, developing some therapies with partners such as Merck, but also a second track where it will identify therapies on its own and then choose a development partner.
On the latter score, the company has a very good chance to use the de novo approach to bring its own therapies to clinical trials, according to Andreas Busch, Absci's chief innovation officer.
"Based on everything we know, we have very, very high confidence that if we find an antibody which matches the target, that we will have something that will have a high efficacy in the disease, and zero side effects, which is very, very rare," said Busch in a separate interview via Zoom. "The only remaining question is can I deliver an antibody against it."
Without disclosing what indication, or condition, the company is going after on its own, Busch said the particular mechanism operating in the disease in question is of such a nature that it lends itself well to the de novo approach.
"My confidence comes from two sides: I really understand the mechanism, and there is no doubt it will work if you address it with the right molecule," said Busch. "And I do have confidence we will find it with the right molecule."
"This is a mechanism that is just one of those very few mechanisms where you know it will work and it has no side effects."
Of course, many things can still happen in phased clinical trials, conceded Busch. "We are very confident, but we haven't proven it," he said of the approach.
Busch's gut instinct is perhaps significant. He has led R&D for several of the most prominent pharma firms, including Sanofi, Bayer, and Shire.
"You have to trust me: I brought ten compounds to the market," he said, meaning he guided ten novel therapies from the chemistry stage through to FDA approval.
When the company is ready to proceed to clinical trials, it would do so, said Busch, with the help of what's called a CRO, a Clinical Research Organization, which knows the ins and outs of the procedures involved, including cohort selection for human trials.
Moving to development would involve bringing in a strategic partner to help Absci, likely a pharmaceuticals firm.
AI lead Meier expects the ability to synthesize proteins to open up a broader range of applications in the years to come.
"Once you have models that are working, you can start to go crazy with different applications, we can start to really go crazy with our imagination of what kind of drugs we can make," he added.
Personalized medicine, a long-held goal in the industry, may be sped along by such AI experiments, he offered.
"Let's say you have a certain kind of cancer, and every patient has a different form of that cancer," explained Meier. "How about making a unique drug for each of those patients? You might go after cancers that you just couldn't really do clinical trials on before because a one drug-fits-all approach would never work."
"The potential is this ability to actually start to go after these un-druggable targets where you haven't been able to effectively generate an antibody" using conventional biological discovery technologies, such as immunization or phage display, said McClain.
In McClain's description, the future of drug discovery sounds something like a ChatGPT prompt.
"You can now throw that target into our model and then specify, I want to hit, you know, this region of the target that's going to give me the biology, and now you've just made that target druggable."
Such drug searches start to take on the quality of rapid code development. "You get it at a click of a button," said McClain, with the result that "you're seeing timelines from five and a half years go down to 18 to 24 months" for the average drug development pipeline.
"And this is what's fundamentally going to lower health care or drug prices."