Document-level Condition-Treatment Relation Extraction on Disease-related Social Media Forums

Sichang Tu, Stephen Doogan, Jinho D. Choi


Social media has become a popular platform where people share information about personal healthcare conditions, diagnostic histories, and medical plans. Analyzing social media posts that depict such realistic information can help improve quality and clinical decision-making; however, the lack of structured resources in this genre limits our ability to build robust NLP models for meaningful analysis. This paper presents a new corpus annotating relations among many types of conditions, treatments, and their attributes illustrated in social media posts by patients and caregivers. For experiments, a transformer encoder is pretrained on 1M raw posts and used to train several document-level relation extraction models on our corpus. Our best-performing model achieves F1 scores of 69.5 and 52.6 for Entity Recognition and Relation Extraction, respectively. These results are encouraging, as this is the first neural model to extract complex relations of this kind from social media data.

Venue / Year

Proceedings of the EMNLP Workshop on Health Text Mining and Information Analysis / 2022


Anthology | Paper | Poster | BibTeX | GitHub