Transformer-based models are widely used for natural language processing tasks and have shown excellent capability in capturing contextual information, especially for document classification. Many existing transformer-based methods, however, treat even semi-structured text data as a single block of text. These methods tend to ignore the hierarchical information and semantic correlations hidden in semi-structured text data, which graph-based network models can capture. This paper proposes a novel graph representation of semi-structured resume data that accounts for the categorical and hierarchical relationships in resumes. Our experiments show that our graph-based models outperform transformer-based methods on resume classification tasks and exhibit better interpretability and generalization.
Computer Science / Emory University
MS / Spring 2020
Jinho D. Choi, Computer Science and QTM, Emory University (Chair)
Michelangelo Grigni, Computer Science, Emory University
Shun Yan Cheung, Computer Science, Emory University