IBM gives cancer-killing drug AI project to the open source community

If we understand more about cancer on the molecular level, we can learn to treat it more effectively.

IBM has released three artificial intelligence (AI) projects, each tailored to the challenge of curing cancer, to the open-source community.

At the 18th European Conference on Computational Biology (ECCB) and the 27th Conference on Intelligent Systems for Molecular Biology (ISMB), which will be held in Switzerland later this month, the tech giant will dive into how each of the projects can advance our understanding of cancers and their treatment. 

Cancer alone is estimated to have caused 9.6 million deaths in 2018, with an estimated 18 million new cases reported in the same year. 

Genetic predisposition and environmental factors, including pollution, smoking, and diet, are all considered to affect how likely someone is to develop cancer. While we can treat many forms, we still have much to learn.

Researchers from IBM's Computational Systems Biology group in Zurich are working on AI and machine learning (ML) approaches to "help to accelerate our understanding of the leading drivers and molecular mechanisms of these complex diseases," as well as methods to improve our knowledge of tumor composition. 

"Our goal is to deepen our understanding of cancer to equip industries and academia with the knowledge that could potentially one day help fuel new treatments and therapies," IBM says. 

The first project, dubbed PaccMann -- not to be confused with the popular Pac-Man computer game -- is described as the "Prediction of anticancer compound sensitivity with Multi-modal attention-based neural networks."

It can take millions of dollars to develop a single drug to tackle cancer, and these financial constraints can delay or scupper efforts to develop new drugs and therapies.

IBM is working on the PaccMann algorithm to automatically analyze chemical compounds and predict which are the most likely to fight cancer strains, which could potentially streamline this process.

The ML algorithm exploits data on gene expression as well as the molecular structures of chemical compounds. IBM says that identifying potential anti-cancer compounds earlier can cut the costs associated with drug development.
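
To give a rough sense of what "attention-based" means here, the sketch below weights a handful of gene expression values by relevance scores and sums them into a single figure. It is not IBM's PaccMann code: the gene names, expression values, and relevance scores are made up for illustration, and in a real network the scores would be learned from data rather than typed in by hand.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(values, scores):
    """Return the attention weights and the weighted sum of the values."""
    weights = softmax(scores)
    summary = sum(w * v for w, v in zip(weights, values))
    return weights, summary

# Hypothetical toy data (assumption): expression levels for three genes and
# hand-picked relevance scores standing in for what a trained model produces.
genes = ["GENE_A", "GENE_B", "GENE_C"]
expression = [0.2, 1.5, 0.7]
relevance = [0.1, 2.0, 0.5]

weights, summary = attend(expression, relevance)
for gene, weight in zip(genes, weights):
    print(f"{gene}: attention weight {weight:.3f}")
print(f"attention-weighted expression: {summary:.3f}")
```

The weights also serve as a crude explanation: the gene with the largest weight is the one the model "looked at" most when forming its summary.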

The second project is called "Interaction Network infErence from vectoR representATions of words," otherwise known as INtERAcT. This tool is a particularly interesting one given its automatic extraction of data from valuable scientific papers related to our understanding of cancer.

With roughly 17,000 papers published every year in the field of cancer research, it can be difficult -- if not impossible -- for researchers to keep up with every small step we make in our understanding. 

INtERAcT aims to make the academic side of research less of a burden by automatically extracting information from these papers. At the moment, the tool is being tested on extracting data related to protein-protein interactions, which have been marked as a potential cause of the disruption of biological processes in diseases including cancer.

"A particular strength of INtERAcT is its capability to infer interactions in the context of a specific disease," IBM says. "The comparison with the normal interactions in healthy tissue may potentially help to obtain insight into the disease mechanisms."
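
The general idea of scoring interactions from vector representations of words can be sketched as follows. This is a simplified illustration, not INtERAcT itself: the protein names and three-dimensional vectors are invented, and plain cosine similarity stands in for whatever scoring the real tool applies to embeddings learned from the literature.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_pairs(vectors):
    """Score every pair of names and sort from most to least similar."""
    pairs = [
        (cosine(vectors[a], vectors[b]), a, b)
        for a, b in combinations(sorted(vectors), 2)
    ]
    return sorted(pairs, reverse=True)

# Hypothetical word vectors (assumption): real embeddings would be trained
# on a corpus of papers and have hundreds of dimensions.
vectors = {
    "PROTEIN_A": [0.9, 0.1, 0.0],
    "PROTEIN_B": [0.8, 0.2, 0.1],
    "PROTEIN_C": [0.0, 0.1, 0.9],
}

ranked = rank_pairs(vectors)
for score, first, second in ranked:
    print(f"{first} / {second}: similarity {score:.2f}")
```

Training the vectors on papers about one disease and again on papers about healthy tissue, then comparing the two rankings, would be one way to approximate the disease-specific comparison IBM describes.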

The third and final project is "pathway-induced multiple kernel learning," or PIMKL. This algorithm uses datasets describing what is currently known about molecular interactions to predict the progression of cancer and potential relapses in patients.

PIMKL uses what is known as multiple kernel learning to identify molecular pathways crucial for categorizing patients, giving healthcare professionals an opportunity to individualize and tailor treatment plans. 
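
Multiple kernel learning, in broad strokes, builds one similarity measure (a kernel) per data view and then learns how much weight each deserves. The sketch below builds one kernel per pathway and weights it by how well it agrees with patient labels. It is an illustration under stated assumptions, not the PIMKL implementation: the patients, genes, and pathways are invented, and a simple kernel-target alignment heuristic replaces the optimization a real solver would run.

```python
import math

def pathway_kernel(samples, gene_idx):
    """Linear kernel computed only on the genes belonging to one pathway."""
    sub = [[row[i] for i in gene_idx] for row in samples]
    return [[sum(a * b for a, b in zip(x, y)) for y in sub] for x in sub]

def alignment(kernel, labels):
    """Kernel-target alignment for +1/-1 labels (higher means better fit)."""
    n = len(labels)
    num = sum(kernel[i][j] * labels[i] * labels[j]
              for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(kernel[i][j] ** 2
                           for i in range(n) for j in range(n)))
    return num / (k_norm * n)  # the label matrix has Frobenius norm n

# Hypothetical data (assumption): four patients, four genes, two pathways.
samples = [
    [1.0, 0.9, 0.3, -0.2],
    [0.8, 1.1, -0.4, 0.5],
    [-1.0, -0.9, 0.2, 0.4],
    [-0.9, -1.2, -0.1, -0.3],
]
labels = [1, 1, -1, -1]
pathways = {"pathway_1": [0, 1], "pathway_2": [2, 3]}

kernels = {name: pathway_kernel(samples, idx) for name, idx in pathways.items()}
scores = {name: max(alignment(k, labels), 0.0) for name, k in kernels.items()}
total = sum(scores.values())
# Fall back to equal weights if no kernel aligns with the labels at all.
weights = {name: (s / total if total > 0 else 1.0 / len(scores))
           for name, s in scores.items()}

# The combined kernel is the weighted sum of the per-pathway kernels.
n = len(samples)
combined = [[sum(weights[p] * kernels[p][i][j] for p in pathways)
             for j in range(n)] for i in range(n)]

for name, weight in weights.items():
    print(f"{name}: weight {weight:.2f}")
```

In this toy data the first pathway separates the two patient groups and the second is mostly noise, so the first receives most of the weight. Reading the weights back is what lets a method of this kind point to the pathways that matter for a given classification.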

The code for PaccMann and INtERAcT has been released and is available on the projects' websites. PIMKL has been deployed on the IBM Cloud and the source code has also been released.

Each project is open-source and has now been made available in the public domain. IBM hopes that by making the source code available to other researchers and academics, the scientific community can maximize the projects' potential impact.
