A path to personalization: Using ML to subtype patients receiving digital mental health interventions

Guest post by Danielle Belgrave, Principal Research Manager; Anja Thieme, Senior Researcher. First published on microsoft.com

Mental health experiences vary widely from individual to individual. Because of this, effective treatment is about identifying the unique set of tools that will help a person manage their mental health productively. And that rarely happens overnight.

People experiencing symptoms of depression, anxiety, or other mental health conditions know the therapeutic process can be a long and arduous one. But as healthcare becomes digital, we see an opportunity in subtyping to help people key in sooner on the treatments that might work for them, eliminating some of the false starts along the way. Subtyping is already happening around us when we receive suggestions, for example, for reads or programs we might enjoy after purchasing a book online or streaming a show. These systems use past behavior to make their recommendations, applying a crude heuristic: people who responded favorably to one thing will likely respond favorably to something similar.

In their relatively young existence, internet-delivered mental health interventions have not only been shown to provide effective treatment at lower costs, but have also produced data from which to learn similar patterns of engagement. And while studies have shown how more engagement with the treatment leads to better outcomes, there’s an awareness that the types of engagement—check-ins with a clinician versus more self-directed work, let’s say—make a difference. Yet, how these different forms of engagement with treatment affect mental health outcomes is less well-known.

Through the continuing Project Talia research collaboration between Microsoft Research Cambridge, Trinity College Dublin, and SilverCloud Health, we’ve developed a proof of concept for identifying more nuanced patterns of engagement. SilverCloud Health, the world’s largest provider of digital mental health services, offers a suite of internet-delivered cognitive behavioral therapy (iCBT) interventions for the treatment of depression, anxiety, and other mental health conditions. Leveraging anonymized long-term engagement data and clinical measures from 54,604 SilverCloud Health patients—the largest dataset of its kind—we built a machine learning framework that identifies subtypes of patients based on their use of the iCBT intervention over time and investigates how these subtypes predict clinical outcomes for patients with symptoms of both depression and anxiety.

Our work, which was published in a JAMA Open article titled “A Machine Learning Approach to Understanding Patterns of Engagement with Internet-Delivered Mental Health Interventions,” paves the way for a better understanding of how, if people engage with treatment differently and get different outcomes, we can optimize and personalize treatment so they receive the best results possible.


Figure 1: Researchers’ hidden Markov model represents how observed engagement (Y) with different sections (s) on the SilverCloud platform transitions over time (t) and whether these engagements with different sections are representative of a latent subtype of engagement (K) that can be learned from the observed data. Here, x is a hidden state representing true engagement each week; π is uniform Dirichlet prior probability, which assumes each patient has equal probability of belonging to a particular state; abr is the probability of a patient moving across states over time; and qs and q's are probabilities of state membership.

Identifying subtypes using probabilistic graphical models

To understand whether different types, or patterns, of patient behaviors exist in the way people engage with an iCBT intervention for depression and anxiety, we employed an unsupervised machine learning approach using probabilistic graphical modeling. Probabilistic graphical models provide an especially rich framework in which to represent complex data structures, such as those found in healthcare, where we’re trying to understand disease progression based on many different symptoms and factors. There are several examples of the use of probabilistic graphical models to understand disease heterogeneity in a longitudinal, observational, clinical context, such as sepsis, asthma, and kidney failure. Here, we’re looking at a variety of different patient actions over an extended period of time across multiple patients to try to identify subtypes. For 54,604 SilverCloud Health patients, that’s more than 3 million data points.

The probabilistic graphical model we used is the hidden Markov model (Figure 1), which assumes behavior today is conditioned on behavior yesterday. So whatever action we observe someone taking on one day is conditioned on their action the previous day. From this model, we want to learn the probability of moving from one state to another: if a patient is engaging with a specific tool today and yesterday they didn’t, what’s the probability they’ll engage with it on a particular day moving forward? We assume the probabilistic patterns of behavior are governed by something that’s not directly observed, but rather is latent and can be inferred from the transitions, or changes in a person’s behavior from one day to the next, over time. The model seeks to capture what we can’t measure directly with raw metrics like resource usage alone: how engaged a person is.
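To make the transition idea concrete, here is a minimal sketch in NumPy. The transition probabilities are invented for illustration and are not the study’s fitted parameters; the sketch asks how likely a currently disengaged patient is to be engaged two steps from now.

```python
import numpy as np

# Illustrative two-state Markov chain over engagement, with made-up numbers:
# state 0 = "not engaged", state 1 = "engaged". Entry T[i, j] is the
# probability of moving from state i at one step to state j at the next.
T = np.array([
    [0.7, 0.3],   # not engaged -> 70% stay disengaged, 30% start engaging
    [0.4, 0.6],   # engaged     -> 40% drop off,        60% keep engaging
])

start = np.array([1.0, 0.0])      # patient is currently disengaged
after_two_steps = start @ T @ T   # propagate the chain forward two steps
print(after_two_steps[1])         # probability of being engaged then
```

Learning transition probabilities like these from many patients’ observed sequences, rather than writing them down by hand, is what fitting the hidden Markov model amounts to.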

Our model represents how observed engagement (Y) with different sections (s) on the SilverCloud platform transitions over time (t) and whether these engagements with different sections are representative of a latent subtype of engagement (K) that we can learn from the observed data. We define a hidden state (x) that represents true engagement each week. We assume a uniform initialization probability (π)—more specifically, a uniform Dirichlet prior probability—and learn the probability of patients transitioning across states over time (abr). These transition probabilities are governed by the overarching latent subtype (K). We don’t know the number or size of subtypes in K, but we learn the optimal number using penalized log-likelihood based on the Bayesian information criterion, which identifies the most parsimonious number of subtypes that best describes the data, with K ranging from 1 to 10.
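The selection of K can be sketched as follows. The log-likelihoods and parameter counts below are hypothetical stand-ins for the values that fitting the model for each candidate K would produce; the BIC formula itself is standard.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    # Bayesian information criterion (lower is better): the n_params * ln(n_obs)
    # term penalizes complexity, favoring the most parsimonious model.
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical (fitted log-likelihood, parameter count) for each candidate K.
fits = {1: (-5200.0, 10), 2: (-4800.0, 21), 3: (-4650.0, 32), 4: (-4640.0, 43)}
n_obs = 1000

scores = {K: bic(ll, p, n_obs) for K, (ll, p) in fits.items()}
best_K = min(scores, key=scores.get)
print(best_K)
```

In this made-up example K = 4 fits the data slightly better than K = 3, but the extra parameters cost more than the fit improvement is worth, so the penalized criterion settles on K = 3.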

Prior to our work, SilverCloud Health patients had given permission for their anonymized data to be used for studies like ours. Omitted from the data were not only names but also age, gender, and other demographics, and the content of information they provided as part of treatment. The data included only what they did on the program—sections they clicked, amount of time spent on particular sections, whether they had conversations with support staff, and the like.


Figure 2: The ML framework developed by researchers to better understand patient engagement with internet-delivered cognitive behavioral therapy identified five subtypes, or classes, of engagement. The above illustrates the level of engagement over time per subtype. Over 14 weeks, Class 1 showed a steady decrease in engagement, while Class 3 engaged often to start before experiencing the steepest decline in engagement. Class 5 maintained the highest level of engagement. Of the 54,604 patients, a small number had equal probability of being assigned to two different subtypes and weren’t counted toward the totals of either.

Identified subtypes and their predicted clinical outcomes

Using our probabilistic model, we identified five subtypes of engagement (Figure 2), inferred based on patterns in how patients interacted with different program sections over 14 weeks (of the 54,604 patients, a small number had equal probability of being assigned to two different subtypes and weren’t counted toward the totals of either):

  • Class 1 (“low engagers”; n = 19,930; 36.5%)
  • Class 2 (“late engagers”; n = 11,674; 21.4%)
  • Class 3 (“high engagers with rapid disengagement”; n = 13,936; 25.5%)
  • Class 4 (“high engagers with moderate decrease”; n = 3,258; 6.0%)
  • Class 5 (“highest engagers”; n = 5,799; 10.6%)
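The exclusion of patients with ambiguous assignments can be sketched like this, with hypothetical posterior membership probabilities standing in for the model’s actual output:

```python
import numpy as np

# Hypothetical posterior probabilities of subtype membership for three
# patients over K = 5 subtypes (each row sums to 1). These are made-up
# numbers, not values from the fitted model.
posteriors = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],   # clear winner: class 1
    [0.40, 0.40, 0.10, 0.05, 0.05],   # tie between two classes -> excluded
    [0.05, 0.05, 0.80, 0.05, 0.05],   # clear winner: class 3
])

assignments = []
for p in posteriors:
    top = p.max()
    if (p == top).sum() > 1:
        # Equal probability for two or more subtypes: leave unassigned,
        # mirroring how ambiguous patients were left out of the totals.
        assignments.append(None)
    else:
        assignments.append(int(p.argmax()) + 1)   # 1-based class label

print(assignments)   # [1, None, 3]
```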

With the end goal of delivering more effective, tailored treatment, we investigated whether these prototypical patient behaviors were associated with improvements in depression and anxiety, as assessed through patients’ regular completion of standardized clinical questionnaires for symptoms of depression (PHQ-9) and symptoms of anxiety (GAD-7).
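For context on the measures: the PHQ-9 sums nine items, each scored 0 to 3, into a severity total between 0 and 27; the GAD-7 works the same way with seven items (totals 0 to 21). A toy calculation with hypothetical item responses:

```python
# Hypothetical PHQ-9 responses for one patient (9 items, each scored 0-3).
week_1 = [2, 1, 2, 1, 0, 1, 2, 1, 0]
week_4 = [1, 0, 1, 1, 0, 0, 1, 0, 0]

def phq9_total(items):
    # Validate the instrument's shape, then sum to the 0-27 severity score.
    assert len(items) == 9 and all(0 <= i <= 3 for i in items)
    return sum(items)

improvement = phq9_total(week_1) - phq9_total(week_4)
print(phq9_total(week_1), phq9_total(week_4), improvement)  # 10 4 6
```

A drop in the total over the course of treatment, as here, is what “improvement in PHQ-9” refers to below.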

We found that these distinct subtypes showed marked differences in PHQ-9 and GAD-7 scores. Patients in Class 3, although they spent less time on the platform than Classes 4 and 5, had significantly greater weekly change in PHQ-9 over time. Class 2 had the least severe symptoms, indicated by the lowest mean initial PHQ-9, and saw the least improvement, even though their initial mean PHQ-9 was not significantly different from the initial mean of Class 5. Trends were similar for anxiety symptoms: after 14 weeks, estimated gains in GAD-7 were greatest among Class 3 patients and lowest among Class 2 patients. The promising news is that all subtypes—including low engagers—experienced positive results, as measured by PHQ-9 and GAD-7, suggesting any type of engagement can be effective in reducing symptoms of depression and anxiety. Patients belonging to groups who engaged more, though, did see better outcomes.

Further investigation revealed the specific usage patterns that contributed to the outcomes observed above. While Class 5 opted for resources such as the relaxation and mindfulness tools, Class 4 made greater use of goal-based activities and mood tracking, as well as much of the core CBT treatment content. Patients in Class 3 were not only more inclined to complete components of the core content, but they also did so within their first few weeks.

Implications for achieving more personalized mental health interventions

We believe this approach of understanding and predicting future interactions with iCBT can lead to earlier intervention strategies and, consequently, better clinical outcomes. It enables us to look beyond an average treatment effect and move toward care that responds to each patient’s needs and desired outcomes—and ultimately toward individualized delivery strategies that can maximize engagement, minimize dropout, and improve patients’ clinical scores. By identifying different subtypes of patients early on, for example, we may be able to recommend at the outset those resources that have been found to lead to better engagement and improve symptoms more effectively.

The sensitive and fluctuating nature of mental health symptomatology requires increased access to interventions that can be personalized in this way, and we see online interventions as a means to achieve that goal. Further research and innovation are necessary, along with the continued identification of meaningful ways for machine learning to have a positive impact on mental well-being. We hope that through our partnership with SilverCloud Health we can leverage advances in AI to help people get the specific help they need sooner.


Download the research