Patients with ovarian cancer tend to underreport their most challenging symptoms; however, an analysis of their online comments could give health care providers insight into how to help.
A machine learning approach that analyzed social media comments helped identify the needs of patients and caregivers affected by an ovarian cancer diagnosis, which, in turn, could help create improved interventions, according to study findings presented at the Oncology Nursing Society’s 44th Annual Congress.
The researchers noted that more studies of this kind should be conducted. “Social media has gained an attention as a source to learn the perspectives, values and needs of patients and caregivers in naturalistic settings,” they wrote. “A thorough understanding of their concerns and needs is the first step to develop interventions for the target population. Language written in the social media can be cues of needs, however, manually identifying those cues is time-consuming and labor-intensive.”
Therefore, in the first study of its kind, researchers from the University of Pittsburgh and the University of British Columbia used a machine learning approach to analyze the language of this population on social media as a means of understanding their concerns so that better interventions can be developed for them and research can be focused on their greatest needs.
The approach aimed to supplement survey questionnaires and interviews as ways to gather this information, said lead author Young Ji Lee, Ph.D., M.S., RN, assistant professor in the School of Nursing at the University of Pittsburgh School of Medicine. She called the method especially relevant at a time when patient-generated health information is increasingly informing care.
The researchers analyzed the initial postings of nearly 855 patients and caregivers who commented in the Cancer Survivors Network online peer-support forum between 2006 and 2016. They applied machine learning, using simple natural language processing techniques, to build a model that determined whether each posting fell into one or more of 12 categories: physical, psychological/emotional, family-related, social, interpersonal/intimacy, practical, daily living, spiritual/existential, health information, patient-clinician communication, cognitive needs and miscellaneous.
The model used bag-of-words features, considering each word in a posting for its potential in classifying needs. The researchers identified important features for each need category using mathematical analysis and performance metrics.
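To make the idea concrete, the sketch below shows one way bag-of-words features can feed a classifier that assigns a posting to one or more need categories. It is an illustration only, not the researchers' model: the article does not name the classifier they used, so the naive Bayes model here is an assumption, and the toy postings, function names and class name are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase a posting and count each word, ignoring word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


class BinaryNaiveBayes:
    """Yes/no classifier for one need category (an assumed model choice)."""

    def fit(self, docs, labels):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(labels)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(bag_of_words(doc))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def log_score(self, doc, label):
        total_docs = sum(self.doc_counts.values())
        # Smoothed prior, then one smoothed log-probability per known word.
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        for word, n in bag_of_words(doc).items():
            if word in self.vocab:
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += n * math.log(p)
        return score

    def predict(self, doc):
        return self.log_score(doc, True) > self.log_score(doc, False)


def train_multilabel(posts, categories):
    """Train one independent yes/no model per category."""
    texts = [text for text, _ in posts]
    return {
        cat: BinaryNaiveBayes().fit(texts, [cat in needs for _, needs in posts])
        for cat in categories
    }


def predict_needs(models, text):
    """Return every category whose model says 'yes' for this posting."""
    return {cat for cat, model in models.items() if model.predict(text)}


# Hypothetical toy postings, invented for illustration (not study data).
TOY_POSTS = [
    ("what does this scan result mean and what treatment comes next",
     {"health information"}),
    ("i feel so much anxiety and anger since the diagnosis",
     {"psychological/emotional"}),
    ("is anyone else here going through this i feel alone and anxious",
     {"social", "psychological/emotional"}),
    ("looking for others to talk with about treatment questions",
     {"social", "health information"}),
]
CATEGORIES = ["health information", "psychological/emotional", "social"]

models = train_multilabel(TOY_POSTS, CATEGORIES)
print(predict_needs(models, "treatment questions talk"))
```

Training a separate yes/no model for each category is what allows a single posting to receive several labels at once, which matches the "one or more categories" setup described above.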
They found that the most frequently occurring needs across postings were health information (456 posts), social (307 posts), psychological/emotional (141 posts) and physical (109 posts). These four categories were also the ones the machine learning model identified most accurately.
Less frequently occurring categories were miscellaneous (74 posts), family-related (53 posts), practical (35 posts), patient-clinician communication (19 posts), interpersonal/intimacy (14 posts), spiritual/existential (10 posts), daily living (five posts) and cognitive (four posts).
Of all the postings, 38% described multiple needs, and of those, 40% described social and informational needs together.
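The reported figures can be checked with a little arithmetic. The category counts add up to 1,227 labels, more than the roughly 855 postings analyzed, which is consistent with some postings describing several needs. The short sketch below tallies the counts and combines the two percentages; it assumes the 40% figure applies to the 38% of postings with multiple needs, as the wording suggests, and the variable names are illustrative.

```python
# Category counts as reported in the article.
NEED_COUNTS = {
    "health information": 456,
    "social": 307,
    "psychological/emotional": 141,
    "physical": 109,
    "miscellaneous": 74,
    "family-related": 53,
    "practical": 35,
    "patient-clinician communication": 19,
    "interpersonal/intimacy": 14,
    "spiritual/existential": 10,
    "daily living": 5,
    "cognitive": 4,
}

# Total number of category labels assigned across all postings.
total_labels = sum(NEED_COUNTS.values())

# Share of all labels that belong to the four most frequent categories.
top_four = sorted(NEED_COUNTS.values(), reverse=True)[:4]
top_four_share = sum(top_four) / total_labels

# 38% of postings described multiple needs; 40% of those paired
# social and informational needs (assumed reading of the article).
MULTI_NEED_SHARE = 0.38
SOCIAL_PLUS_INFO_SHARE = 0.40
social_plus_info_overall = MULTI_NEED_SHARE * SOCIAL_PLUS_INFO_SHARE

print(f"Total labels: {total_labels}")
print(f"Top four categories: {top_four_share:.1%} of labels")
print(f"Social + informational together: {social_plus_info_overall:.1%} of postings")
```

Under that reading, about 15% of all postings described social and informational needs together.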
Words describing psychological states, such as “anger” and “anxiety,” were important features for the classification of psychological/emotional and social needs, and medical terms, such as “endoscopy” and “colonoscopy,” were predictive that a post would focus on physical and informational needs.
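One simple way to surface such predictive words is to compare how often each word appears inside and outside a category. The sketch below ranks words by a smoothed log-odds ratio. This is an assumed technique chosen for illustration, since the article says only that the researchers used "mathematical analysis and performance metrics"; the toy postings and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a posting into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def top_features(posts, category, k=3):
    """Rank words by how much more often they occur inside the category."""
    in_cat, out_cat = Counter(), Counter()
    for text, needs in posts:
        (in_cat if category in needs else out_cat).update(tokenize(text))
    vocab = set(in_cat) | set(out_cat)
    n_in, n_out = sum(in_cat.values()), sum(out_cat.values())

    def log_odds(word):
        # Add-one smoothing keeps unseen words from producing log(0).
        p_in = (in_cat[word] + 1) / (n_in + len(vocab))
        p_out = (out_cat[word] + 1) / (n_out + len(vocab))
        return math.log(p_in / p_out)

    # Highest log-odds first; ties are broken alphabetically.
    return sorted(vocab, key=lambda w: (-log_odds(w), w))[:k]


# Hypothetical toy postings, invented for illustration (not study data).
TOY_POSTS = [
    ("anxiety and anger keep me awake", {"psychological/emotional"}),
    ("my anxiety is worse before every scan", {"psychological/emotional"}),
    ("the doctor scheduled a colonoscopy and an endoscopy",
     {"physical", "health information"}),
    ("what should i expect from my colonoscopy", {"health information"}),
]

print(top_features(TOY_POSTS, "psychological/emotional", k=1))
print(top_features(TOY_POSTS, "health information", k=1))
```

On a real corpus, common words such as "the" or "and" would usually be filtered out first so that they do not crowd the rankings.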
The researchers concluded that even simple word-analysis programs can detect patient and caregiver needs with a high degree of accuracy, and that the same approach can predict multiple needs at once. They said the findings support future use of these models to help patients and their caregivers.