Think of your immune response as a giant machine-learning problem, with your body as the computer.
Immune cells travel around your body, sampling all sorts of matter they come into contact with, from your own cells to the cells of organisms that definitely shouldn't be there. If immune cells encounter something they know shouldn't be part of your body -- bacteria or a virus, say -- the body scales up the cells that know how to deal with that interloper.
If there's a cell that's seen the intruder before and knows how to tackle it, your body rapidly reproduces it thousands of times -- enough that it can overwhelm the bacteria or virus before it has time to make its home in your body. And once the invader is routed, the immune system reduces the number of those cells again, keeping just enough in reserve that -- should the bacteria try a repeat assault -- there are enough immune foot soldiers to rout them once again.
Earlier this year, Microsoft announced a deal with Adaptive Biotechnologies, a health-tech and gene-sequencing company based out of Seattle. Adaptive Biotechnologies' gene sequencers are currently used to detect residual myeloma -- that is, cells that show a person who's been treated for blood cancer isn't entirely free of the disease.
Now, the company is thinking beyond just tracking down a single disease; it's aiming to identify anything that could throw your immune system out of whack, from infections to cancer -- and it's relying on Microsoft's machine-learning capabilities to help it get there.
The human immune system works on multiples large enough to make your head spin. There are two billion lymphocytes in the body, among them 'helper' T cells and 'cytotoxic' or 'killer' T cells.
Each T cell can recognise the antigens -- the triggers that will set off the immune system -- that are the signatures of bacteria, viruses, fungi or other invaders that have entered the body. Each T cell can bind to hundreds of different antigens, each potentially unique to a different bacterium or virus.
Once a T cell has got a hit, depending on what type of T cell it is, it may kill the invader, or signal the millions of other immune cells to come and take on the wrongdoer too. Anyone taking a snapshot of the immune system when the T cells are activated, by noting which T cell receptors are activated and which antigens they bind to, could work out which disease has taken over the body. And, once the disease is known, doctors can see more clearly how it can be treated.
Adaptive Biotechnologies started in 2009, set up to read and scan the immune system and the receptors on immune cells. Over time, the company began not only tracking immune receptors, but working out the link between the receptors and the antigens they bind to. By working out the binding relationships, the company started making steps towards being able to diagnose particular diseases from the immune receptors.
But then, according to CEO and co-founder Chad Robins, the company realised that "we needed really sophisticated machine learning and computational power to really crack the problem -- this is a massive problem on the order of web scale".
Peter Lee, Microsoft's corporate VP of artificial intelligence and research, notes that each human genome is around 200GB: "And that's just for genome data -- for the metadata that would be extracted there and also the other sources of data from imaging, from wearables, from longitudinal patient health records that one would want to correlate with the genomic data at population scale, is enormous. The information content is way beyond human comprehension, so the need for artificial intelligence and for data analysis really becomes fundamental."
A single blood sample will typically yield about a million T cells. Each of those T cells has a receptor that is genetically programmed to bind to specific antigens. "Being able to translate the readout of those T cell receptors' DNA sequences to a set of antigens, and then do the perfect translation of those antigens, to disease states is also a very, very large machine-learning problem," Lee added.
And that's where Microsoft's machine learning comes in. Microsoft is using algorithms that have been adapted from the ones the company currently uses for natural-language translation. "There's some similarity to what we do with the Bing search engine that's called topic identification," Lee said. Microsoft uses Adaptive Biotechnologies' MIRA system to generate the training data used to create a 'translation map' from T cell receptors to antigens, and then to map those antigens back to diseases as accurately as possible.
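To make the two-step 'translation' idea concrete, here is a minimal sketch in Python. It is not Microsoft's or Adaptive's method: the real system is described as a learned model, whereas this toy uses hand-written lookup tables. The receptor sequences, antigen names, disease labels and the `score_diseases` function are all invented for illustration.

```python
from collections import Counter

# Hypothetical toy data: the receptor sequences, antigen names and
# disease labels below are made up and carry no biological meaning.
RECEPTOR_TO_ANTIGENS = {
    "CASSLGQAYEQYF": ["antigen_A"],
    "CASSPGTGGNEQFF": ["antigen_A", "antigen_B"],
    "CASRDRGNTIYF": ["antigen_C"],
}
ANTIGEN_TO_DISEASE = {
    "antigen_A": "disease_X",
    "antigen_B": "disease_X",
    "antigen_C": "disease_Y",
}


def score_diseases(receptor_counts):
    """Follow receptor -> antigen -> disease and sum T cell counts per disease."""
    scores = Counter()
    for receptor, count in receptor_counts.items():
        # Receptors missing from the map contribute nothing.
        antigens = RECEPTOR_TO_ANTIGENS.get(receptor, [])
        # Count each disease once per receptor, even if several of
        # that receptor's antigens point to the same disease.
        diseases = {ANTIGEN_TO_DISEASE[a] for a in antigens if a in ANTIGEN_TO_DISEASE}
        for disease in diseases:
            scores[disease] += count
    return scores


# A pretend blood sample: receptor sequence -> number of T cells carrying it.
sample = {
    "CASSLGQAYEQYF": 40,
    "CASSPGTGGNEQFF": 25,
    "CASRDRGNTIYF": 5,
    "CASSUNKNOWNF": 30,  # not in the map, so it is ignored
}
scores = score_diseases(sample)
top_disease, top_score = scores.most_common(1)[0]
print(top_disease, top_score)
```

In this toy, the fourth receptor is simply dropped because nothing is known about it. The hard part Lee describes is learning the two maps from data in the first place, which a lookup table sidesteps entirely.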
If this all sounds a bit abstract, the practice could have concrete benefits: if the mapping works as Adaptive and Microsoft hope, patients could be diagnosed with diseases before they even know they're sick. For example, the symptoms of ovarian cancer are so insidious that the disease is often not detected until a late stage, when it carries a poor prognosis. By pre-emptively testing people with genetic mutations, such as BRCA1, that put them at greater risk of ovarian cancer, the test could pick up the tell-tale immune signals that indicate early cancer. The earlier you catch the disease, the better the chance of treating it successfully.
Adaptive is now putting its efforts into working on just two diseases "with unmet medical needs, either they're very, very hard to diagnose and/or the diagnosis allows a treatment intervention that can significantly impact the patient's care," said Adaptive's Robins.
Adaptive has a couple of diseases in mind to target first, and once it's proved the model, it's hoping to keep 'layering up' more and more conditions using the same system. "If we can really get to a diagnostic outcome of those, we would then proceed to the next two and the next five and the next 20 and so on for the subsequent years," Microsoft's Lee says.
Once one disease has been cracked, does it become easier or faster for the machine learning to crack the next one?
"Let me give you an optimistic and a pessimistic answer," says Lee. "Optimistic is that the deep neural nets in the hidden layers innately learn some hidden structure about the way the immune system works and then at some point after half a dozen or 60 or 100 or however many diseases, you just get this explosion in capability." At some point, potentially the neural nets will just be able to understand and decode each new disease without any retraining.
There is, of course, a pessimistic view. "You hit some sort of brick wall. At some point, the value of the additional training data and the value of the additional computing power starts to trail off. We sometimes see this in areas like machine translation: a couple of months ago we announced we had achieved human parity in translating English to Mandarin. We got 90 percent of the way there, but to get the last 10 percent to achieve human parity, we needed twice as much computing power and twice as much data... We really at this point don't know what situation we're in when we're trying to map T cell receptor sequences to antigens to disease states. We're hoping it's the former, but it might be the latter, or some sort of combination of the two."
While no one knows yet if the pessimistic or optimistic view will turn out to be correct, Adaptive is expecting its first single-disease diagnostic test to be available within a three-year timeframe, with a more comprehensive multi-disease screening test in around eight to ten years.
"As we start layering [each single-disease test] on top, one after another after another, at a certain point, the cost effectiveness, ease of use, and efficiency of doing it all together at one time will just make enough sense. It will become a screen of a biological system and that's what we'll be driving toward," says Adaptive's Robins.
Much as you might expect to go to your doctor for a regular medical or be invited to breast or bowel cancer screening when you hit milestone birthdays, in future you might just be asked to provide a single blood draw that will be analysed to tell you which diseases to watch out for, or even to tell you to get treated for conditions you never suspected you had.
There is even a chance that the system may diagnose you with a one-in-a-billion condition, or one that's entirely new to medicine, says Microsoft's Lee.
"It seems likely we will detect conditions in people that haven't been understood or seen very often, or even ever before. The question of the value of those observations to medical research and the advance of science, is something we wonder about. That's another reason we're motivated to have some sort of open explorer for the data that we start to generate so that that might support scientific discovery."