PhD 2023S: Liyan Xu

Facilitate Document Understanding by Leveraging Language Structures

Liyan Xu

Date: 2023-02-24 / 4:00 ~ 5:00 PM
Location: MSC W301


This dissertation examines the limitations of pretrained language models (PLMs) in processing multi-sentence or multi-paragraph inputs for document understanding tasks. To address these limitations, it investigates the use of intrinsic language structures, including syntactic, discourse, and knowledge-specific structures, to enhance context understanding. It presents four works that demonstrate the effectiveness of incorporating these structures for machine reading comprehension, coreference resolution, and information extraction. The empirical results suggest that modeling these structures can complement the sequence modeling of PLMs and significantly improve performance on document-oriented tasks. Ultimately, this dissertation contributes to the research community's understanding of how leveraging language structures can advance natural language understanding.