Honors Thesis 2022 - Yingying Chen

Analysis of Temporal Relations in Various Types of Text Data

Yingying Chen

Highest Honors in Quantitative Theory and Methods


Abstract

Detecting the temporal relations among events in a text is a complex natural language understanding task, yet establishing the timeline of events is key to improving machine comprehension. Previous work has specified approaches to identifying events in texts and has proposed appropriate temporal relations and ways to order events with respect to one another. However, the vast majority of existing temporal dependency annotation has been carried out on simple narrative text or news sources, and the annotation schemes are not always applicable to noisy, highly variable social media texts such as Reddit posts. We devise a more generalized and robust scheme to support a broader range of text annotation. In this research, we aim to 1) improve existing annotation guidelines to handle more complex sentence structures, 2) evaluate annotation performance among student annotators, with the goal of achieving competitive inter-annotator agreement scores, 3) quantify the characteristics unique to Reddit text and provide a statistical analysis of the difficulties encountered when annotating Reddit data, and 4) compare and contrast the effectiveness of our temporal annotation scheme across three diverse sources: children’s stories, social media texts, and news articles. The results show that our annotation scheme is effective in identifying events, with high inter-annotator agreement scores, but there is still room for improvement in identifying the timelines of events. Our results also show the challenges of creating a unified temporal relation scheme for different types of text. These challenges lead to a discussion of how to evaluate the effectiveness of temporal relation schemes.

Department / School

Quantitative Theory and Methods (Linguistics) / Emory University

Degree / Year

BS / Spring 2022

Committee

Jinho D. Choi, Computer Science and QTM, Emory University (Chair)
Marjorie Pak, Linguistics, Emory University
Jason McLarty, Linguistics, Emory University

Links

Anthology | Paper | Presentation
