Honors Thesis 2018 - Kaixin Ma

Challenge Reading Comprehension on Daily Conversation: Passage Completion on Multiparty Dialog

Kaixin Ma

Highest Honor in Computer Science


Abstract

This thesis expands a previously constructed corpus and presents a robust deep learning architecture for passage completion, a reading comprehension task, on multiparty dialog. Given a dialog in text and a passage containing factual descriptions of the dialog in which mentions of the characters are replaced by blanks, the task is to fill the blanks with the character names that best reflect the contexts in the dialog. A previous researcher constructed a dataset by selecting transcripts from a TV show, generating passages for each dialog through crowdsourcing, and annotating mentions of characters in both the dialogs and the passages. This work expands that dataset following the same pipeline and fixes errors across the entire dataset. Given this dataset, a deep neural model is developed that integrates rich feature extraction from convolutional neural networks (CNN) into sequence modeling with recurrent neural networks (RNN), optimized by utterance-level and dialog-level attention. The model outperforms a bidirectional LSTM model that was the previous state of the art on this task in a different genre, showing an improvement of more than 13.0% for longer dialogs. The analysis shows the effectiveness of the attention mechanisms and suggests a direction for machine comprehension on multiparty dialog.
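
To make the input and output of the task concrete, here is a minimal sketch in Python. It is not the thesis's model: it is a naive baseline that fills every blank with the most frequent speaker. The `@blank` token, the `Utterance` type, the function name, and the toy dialog are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass

BLANK = "@blank"  # assumed placeholder token, not taken from the thesis


@dataclass
class Utterance:
    speaker: str  # character who speaks the line
    text: str     # what the character says


def fill_blanks_majority(dialog, passage):
    """Fill every blank with the most frequent speaker (naive baseline)."""
    counts = Counter(u.speaker for u in dialog)
    guess = counts.most_common(1)[0][0]
    return passage.replace(BLANK, guess)


# Toy example (invented data, not from the corpus).
dialog = [
    Utterance("Alice", "I finally got the job!"),
    Utterance("Bob", "That is great news."),
    Utterance("Alice", "I start on Monday."),
]
passage = f"{BLANK} tells a friend about a new job."
completed = fill_blanks_majority(dialog, passage)
print(completed)
```

A baseline like this ignores the passage entirely, which is why a model that reads both the dialog and the passage, such as the attention-based architecture described above, is needed for the real task.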

Department / School

Computer Science / Emory University

Degree / Year

BS / Spring 2018

Committee

Jinho D. Choi, Computer Science and QTM, Emory University (Chair)
Ken Mandelberg, Computer Science, Emory University
Connie Roth, Physics, Emory University
