Computational semantics encompasses many of the unsolved problems of Natural Language Processing (NLP). The need for innovation in this field has motivated much research into word and language models that better represent meaning and other semantic concepts. This thesis is concerned specifically with two tasks within this field: measuring verb similarity and verb clustering. The goal is to develop representations of verbs that can be used accurately and practically to judge semantic similarity between verbs and to group verbs into classes that reflect their relatedness in meaning. Verb clustering, the task of distributing verbs into semantically related classes, has been shown in previous research to have applications in multiple NLP tasks, including word sense disambiguation. This thesis presents and compares several methods for the automatic acquisition of verb similarity, with the goals of enabling future applications of these methods in NLP tasks and of promoting discoveries about how the mechanisms modeled by these methods relate to linguistics.
This thesis presents several methods for verb clustering based on Latent Dirichlet Allocation (LDA), a probabilistic graphical model commonly used for topic modeling. We model verbs as collections of contextual features derived from latent classes. LDA, which is designed for Bayesian inference of latent thematic categories, is well suited to modeling verb classes based on linguistic context. We demonstrate Recursive LDA, a procedure that executes LDA iteratively to produce a hierarchical structure of classes. We test several linguistic features drawn from the syntax and lexical arguments of verbs, with the aim of identifying how informative each feature is. We evaluate all of our experiments against human judgments of similarity, providing a novel method for evaluating the semantic similarity metrics of word models. We run all of our experiments on a list of the 3,000 most common English verbs.
We compare our method against Word2Vec, a popular and recently developed word model that learns feature vectors with a skip-gram neural network. The results in this thesis show that, given the right features, our method of using LDA with linguistic features outperforms Word2Vec's data-driven statistical approach when evaluated against human judgments.
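As an illustration of the evaluation idea described above, the following minimal sketch compares model similarity scores with human similarity ratings. It is not the procedure used in the thesis: the verb-class distributions, the verb pairs, and the human ratings are invented toy values, and the choices of cosine similarity and Spearman rank correlation are assumptions made for this example only.

```python
import math

def cosine(p, q):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical distributions over three latent classes, standing in for
# what a topic model such as LDA might assign to each verb (toy values).
verb_classes = {
    "run":    [0.7, 0.2, 0.1],
    "sprint": [0.6, 0.3, 0.1],
    "eat":    [0.1, 0.1, 0.8],
    "devour": [0.1, 0.2, 0.7],
}

# Hypothetical human similarity ratings for verb pairs (toy values).
human_ratings = {
    ("run", "sprint"): 9.0,
    ("eat", "devour"): 8.5,
    ("run", "eat"): 2.0,
    ("sprint", "devour"): 1.5,
}

pairs = list(human_ratings)
model_scores = [cosine(verb_classes[a], verb_classes[b]) for a, b in pairs]
human_scores = [human_ratings[pair] for pair in pairs]
rho = spearman(model_scores, human_scores)
print(f"Spearman correlation with human ratings: {rho:.2f}")
```

A rank correlation is used in this sketch because it depends only on the ordering of the verb pairs, so the model scores and the human ratings do not need to share a scale.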
Computer Science / Emory University
BS / Spring 2016
Jinho D. Choi, Computer Science and QTM, Emory University (Chair)
Marjorie Pak, Linguistics, Emory University
Phillip Wolff, Psychology, Emory University