Death and data science: How machine learning can improve end-of-life care

KenSci CTO Ankur Teredesai is using machine learning and artificial intelligence to explore dynamics in the healthcare system at the end of an individual's life.
Written by Larry Dignan, Contributor

Video: How KenSci uses machine learning and AI to predict end of life

KenSci, a company that has developed a machine learning risk prediction platform for healthcare, recently presented a paper on predicting end-of-life mortality and improving care.

The paper, which tackles the tricky topic of predicting the last six to 12 months of life for patients, was accepted by the Association for the Advancement of Artificial Intelligence. At stake is $205 billion in costs spent on care in the last year of an individual's life. But it's not just about costs. Here's an excerpt from the paper, Death vs. Data Science: Predicting End of Life.

The number of Americans using palliative care services continues to grow and was estimated at 1.7 million, or about 46% of those who die (NHPCO 2016). Yet these services are being utilized too late: the median length of stay in hospice care in 2016 was only 23 days. Additionally, 28% of hospice patients were discharged or died within 7 days of hospice enrollment (NHPCO 2016). In work by Christakis and colleagues, they suggest that hospice clinicians consider 80-90 days of hospice care as optimal for the needs of patients and their families (Christakis 1997). Surveys of family members of decedents indicate that satisfaction with end of life care is correlated with their perception of timeliness of hospice referral (Teno et al. 2007). Finally, providers that commonly encounter in-hospital patient death, like intensivists and critical care nurses, have high rates of professional burnout (Embriaco et al. 2007). It follows to conclude, therefore, that timely and appropriate end of life care impacts all aspects of the Quadruple Aim in healthcare (quality, satisfaction, cost savings, and provider satisfaction).

KenSci CTO Ankur Teredesai

As part of our ongoing series on data scientists and their approaches, we caught up with Ankur Teredesai, CTO of KenSci and one of the authors of the paper, which was recognized in the emerging technologies category.

What data sets did you use to model?

The challenge of predicting 6- to 12-month mortality risk is fairly complex; it's a $205 billion problem in the US alone. At KenSci, we have a platform designed for scale and operational effectiveness of machine learning, so we can take on high-impact societal problems such as this one. In this particular setting, we had existing machine learning models for 6- to 12-month mortality prediction from prior efforts. We partnered with two major health systems in the Pacific Northwest, then re-trained our models and created additional ones with new data.

The data from Health System A came from a patient population with a history of heart failure (HF), and included 4,888 patients with a variety of electronic medical record data, including:

  • Demographic features
  • Patient length of stay
  • Overall cost-related features
  • Specific cost-related features (in-patient, out-patient, home health, hospice, skilled nursing facility)
  • Readmissions information
  • Counts of procedures performed, tracked through the Healthcare Common Procedure Coding System (including things like ambulance rides, equipment, and prosthetics)

The data from Health System B consists of patients with any type of illness and includes 48,365 patients. Only claims data was available for Health System B.

The paper has details on the data elements used for the modeling.
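The article doesn't show KenSci's actual models, but a minimal sketch of the kind of setup described above (a binary 6- to 12-month mortality classifier over tabular demographic, length-of-stay, cost, and procedure-count features) might look like the following. The feature names and the synthetic data are illustrative stand-ins, not the paper's real data elements:

```python
# Hypothetical sketch: a mortality-risk classifier over tabular EMR/claims
# features. All data here is synthetic; feature names mirror the categories
# listed above (demographics, length of stay, cost, procedure counts).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.integers(40, 95, n),       # age (demographic feature)
    rng.exponential(5.0, n),       # patient length of stay (days)
    rng.exponential(10000.0, n),   # overall cost-related feature
    rng.poisson(3.0, n),           # count of HCPCS-coded procedures
])
# Synthetic label loosely tied to age, length of stay, and utilization.
logit = 0.05 * (X[:, 0] - 70) + 0.08 * X[:, 1] + 0.1 * X[:, 3] - 1.5
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
# Scaling matters here because cost dwarfs the other features in magnitude.
model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"held-out AUC: {auc:.2f}")
```

In practice the risk score, not the raw prediction, is what gets surfaced to a clinician, which is why the sketch evaluates with AUC rather than accuracy.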

Read also: What is AI? Everything you need to know about Artificial Intelligence | Science-based healthcare: How IoT and AI can help us make health decisions based on data not opinion

How do big data techniques apply to your research? What dream data sets are missing from this effort?

We leverage the Microsoft Azure cloud for some of the underlying components. We also integrate with existing enterprise big data investments to ensure that healthcare can benefit from the volume and variety of available data sources.

KenSci works with healthcare partners across the world on diverse data sets ranging from EMR (electronic medical record) to psychosocial to claims to billing and finance, enabling a longitudinal view of a patient and the overall hospital population. The system is cloud-based and therefore connects to new data sources as they become available.

Assisting a physician in transitioning a patient to palliative care based on insights gained from a 6- to 12-month mortality prediction is a very complex endeavor. Data like demographics and co-morbidities provide good results, but additional sources such as physician input or variations in prescriptions can often provide significant additional information. At the end of the day, there is never an ideal "dream" dataset in machine learning. EMRs tend to contain less than 10 percent of the information about a person. In an increasingly connected world, we will continue to generate additional data assets that add to the complexity of data-driven decisions. The advantage of machine learning is its ability to learn incrementally and improve with more data and feedback.

Read also: A day in the data science life: Salesforce's Dr. Shrestha Basu Mallick

How did you build the model and what was the role of human input in building it?

We built the model with assistive intelligence in mind. Every model that we develop at KenSci is built with the idea that human input will be a key factor at each step in providing care. The KenSci machine learning (ML) platform favors explainable ML models that can be interpreted for correctness and validation. Physicians and clinicians at KenSci not only validate the outputs of the ML models, but also help decide the input features when it comes to clinical workflows before we integrate a model into any tool. The entire process is very stringent, and we are always looking for ways to make it even more assistive and at the same time rigorous.
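The article doesn't specify which model families KenSci considers "explainable," but as an illustration of what clinician validation can look like, a linear model's per-feature contributions to a patient's risk score can be read off directly. The feature names below are hypothetical:

```python
# Illustrative sketch (not KenSci's actual tooling): an interpretable linear
# model whose per-feature log-odds contributions a clinician could inspect.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
features = ["age", "prior_admissions", "comorbidity_count"]  # hypothetical
X = rng.normal(size=(500, 3))
# Synthetic outcome driven by age and comorbidities, not prior admissions.
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Per-patient explanation: each feature's additive contribution to the
# log-odds of the predicted risk, sorted by magnitude for review.
patient = X[0]
contribs = dict(zip(features, model.coef_[0] * patient))
for name, c in sorted(contribs.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>20}: {c:+.2f} log-odds")
```

A clinician can sanity-check such a readout against clinical intuition (e.g., a feature with no plausible causal link to mortality carrying a large weight is a red flag), which is the kind of validation loop the answer above describes.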


The topic is a sensitive one, and there will naturally be fears about care being decided by an algorithm. What's the best approach for determining what course to take with end-of-life issues?

At KenSci, we seek to enhance the quality of patient outcomes by increasing hospital and caregiver efficiency. AI algorithms deliver insights into who might get sick, how sick, and when, and how patients can be served more effectively across the care continuum. While AI is still new to healthcare, its intelligence can be used by caregivers and hospital systems to become more efficient. The physician will always be the decision maker, and an algorithm should never come between the doctor-patient relationship. Generalized intelligence is a tool we need to use, but the decision point when it comes to healthcare and end of life lies with the doctor and patient.

At KenSci, we think of artificial intelligence as assistive intelligence -- i.e., it is meant to enable the experts who are using the technology, not replace them. This also applies to end-of-life care transition issues. The models were designed to help where physicians may be missing attributes: given the large number of variables ML can pay attention to in risk stratification, AI can supply additional knowledge to support a well-informed decision.

Read also: Can AI make your health insurance better?

Would this research be possible without EHRs, and how do you handle the data that's still unstructured in the medical system (i.e., paper or worse)?

EHR data is necessary but not sufficient to generate deep insights and predictions in the healthcare domain. While unstructured data can add useful information for predictive models, even simple problems across the care continuum remain unsolved, because even structured data is not being used to its full capacity. Structured data offers enough richness to provide descriptive statistics and build sufficiently good predictive models for problems like risk of readmission, mortality prediction, emergency department utilization prediction, and so on. However, EHR and other structured data have not been applied to their full potential toward this end.


How do you approach a topic like cost savings on the macro level when these predictions are inherently personal?

While end of life care is inherently personal, predicting high-cost patient cohorts and identifying patterns that lead to high cost and high utilization is vital for hospitals and health systems. KenSci's solution can help determine high-cost cohorts by analyzing longitudinal health records, predicting future high utilizers by modeling diseases, and predicting end of life to improve palliative care utilization.

However, a system like this can do more than simply provide predictions on end of life -- it can also allow providers to explore patient risk profiles and predict potential readmissions. While cost savings are obviously attractive to health systems, systems like these enable better patient care across the spectrum. In various instances, ML systems can help reduce physician burnout, assist with staffing, and identify patients who may need medical interventions. The insights gained from ML systems can help caregivers have more informed and pre-emptive conversations with patients regarding their wishes for end-of-life care.

