MS Thesis 2021 - Xiangjue Dong

Analysis of Graph-Based Semi-Structured Categorical Model for Competence-Level Classification

Xiangjue Dong


Abstract

Transformer-based models have been widely used for many natural language processing tasks and have shown excellent capability in capturing contextual information, especially for document classification. Many existing transformer-based methods, however, treat even semi-structured text data as a single block of text. These methods tend to ignore the hierarchical information and semantic correlations hidden in semi-structured text data, which can be captured by graph-based network models. This paper proposes a novel graph representation of semi-structured resume data that considers the categorical and hierarchical relationships in resumes. Our experiments show that our graph-based models outperform transformer methods for resume classification tasks and show better interpretability and generalization.

Department / School

Computer Science / Emory University

Degree / Year

MS / Spring 2021

Committee

Jinho D. Choi, Computer Science and QTM, Emory University (Chair)
Carl Yang, Computer Science, Emory University
Abeed Sarker, Biomedical Informatics, Emory University

Links

Anthology | Paper | Presentation