PhD 2021F: Han He

Encode Linguistic Structures into Pre-trained Seq2seq Transformers

Han He

Date: 2021-10-01 / 3:00 ~ 4:00 PM


Over the past few years, the number of AMR parsing and application papers has increased exponentially. Consequently, there is a great need for larger corpora to facilitate better parsing performance. Unfortunately, manual annotation is an expensive and labor-intensive procedure, so we propose an end-to-end method to convert existing richly annotated corpora into AMR graphs. Specifically, we propose to encode linguistic structures into the encoder of a pre-trained seq2seq transformer and to use its decoder for AMR parsing. We will demonstrate the capability to encode and decode structures using a seq2seq model alone. This allows us to gather large numbers of AMR graphs in very little time while maintaining high annotation accuracy.
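The talk's core idea — feeding linguistic structure to a plain seq2seq model — relies on linearization: a tree or graph is serialized into a flat token sequence the encoder can read, and the decoder emits an AMR graph as a bracketed string. The sketch below illustrates the general technique only; the marker scheme, helper names, and data format are illustrative assumptions, not the speaker's actual implementation.

```python
# Illustrative sketch of structure linearization for a seq2seq model.
# The marker format (<label:head>) and PENMAN-style serialization are
# assumptions for demonstration, not the method presented in the talk.

def linearize_dependency_tree(tokens, heads, labels):
    """Interleave each token with a structural marker encoding its
    head index and relation label, yielding a flat sequence that a
    standard seq2seq encoder can consume."""
    out = []
    for tok, head, lab in zip(tokens, heads, labels):
        out.append(tok)
        out.append(f"<{lab}:{head}>")
    return " ".join(out)

def linearize_amr(node):
    """Serialize a nested (variable, concept, edges) AMR node into a
    PENMAN-style string that a decoder could emit token by token."""
    var, concept, edges = node
    parts = [f"({var} / {concept}"]
    for rel, child in edges:
        if isinstance(child, tuple):          # nested AMR node
            parts.append(f":{rel} {linearize_amr(child)}")
        else:                                 # constant / string leaf
            parts.append(f":{rel} {child}")
    return " ".join(parts) + ")"

# Encoder side: dependency tree as a flat marked-up sequence.
src = linearize_dependency_tree(
    ["The", "boy", "runs"], [2, 3, 0], ["det", "nsubj", "root"])
# → "The <det:2> boy <nsubj:3> runs <root:0>"

# Decoder side: target AMR graph as a PENMAN-style string.
tgt = linearize_amr(("r", "run-01", [("ARG0", ("b", "boy", []))]))
# → "(r / run-01 :ARG0 (b / boy))"
```

In a real system, sequences like `src` and `tgt` would be tokenized and used as the input and output of a pre-trained seq2seq transformer fine-tuned on structure-to-AMR pairs.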