Honors Thesis 2018 - Lindsay Hexter

Monkeying Around: Automatically Analyzing Malaria Infections in Rhesus Macaques

Lindsay Hexter

Highest Honor in Computer Science


Abstract

In today’s age of big data, automatic processing techniques are becoming more important than ever, especially in the fields of biology and medicine. Many studies focus on genomic data, following the rise of high-throughput sequencing; this project instead analyzes selected blood parameters measured in rhesus macaques housed at the Yerkes National Primate Research Center at Emory University.

This study was initially motivated by the paper “Plasmodium cynomolgi infections in rhesus macaques display clinical and parasitological features pertinent to modelling vivax malaria pathology and relapse infections” (Joyner et al., 2016). Joyner and his team followed infections with the malaria parasite species P. cynomolgi in monkeys, collecting blood data and other biological information daily. While the paper discusses possible points of difference between monkeys of varying disease severity, we sought an automatic way to use these “clinical and parasitological features” to characterize and predict aspects of malaria, including the severity and stage of the longitudinal infection.

We propose to replicate existing analyses and to add new insights via various computational techniques. Machine learning is traditionally applied to very large datasets; this thesis therefore aims to provide a proof of concept for automatically analyzing smaller datasets of this type, whose size is limited by the restrictions on studying monkeys. The flow of computation is as follows: normalization of the data, creation of mathematical models, residual calculation, formation of residual matrices for clustering, and lastly the generation of regression models. The same procedure is then applied to shifted data for comparison, using Bayesian optimization. This study therefore provides a comprehensive framework for automatic analysis of medical data, which can be applied to other datasets.
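
To make the flow of computation above more concrete, the following is a minimal Python sketch of the first five steps. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not say which mathematical models, clustering method, or regression method were used, so the quadratic fit in time, the small k-means routine, and the ordinary least-squares regression are stand-ins, and all data, names, and the "severity" target are synthetic and hypothetical. The shifted-data comparison and Bayesian optimization step is not covered.

```python
import numpy as np

def zscore(x):
    """Normalize each column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def polynomial_residuals(days, values, degree=2):
    """Fit a polynomial in time (assumed model) and return observed minus fitted."""
    coeffs = np.polyfit(days, values, degree)
    return values - np.polyval(coeffs, days)

def residual_matrix(days, data, degree=2):
    """Stack per-parameter residuals into an (n_days, n_parameters) matrix."""
    cols = [polynomial_residuals(days, data[:, j], degree)
            for j in range(data.shape[1])]
    return np.column_stack(cols)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means (assumed clustering method); returns one label per row."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

def linear_regression(features, target):
    """Ordinary least squares with an intercept; returns [intercept, weights...]."""
    design = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return weights

# Synthetic stand-in for daily measurements of three blood parameters.
rng = np.random.default_rng(0)
days = np.arange(30.0)
raw = np.column_stack([
    12.0 - 0.1 * days + rng.normal(0.0, 0.3, days.size),
    200.0 + 3.0 * days + rng.normal(0.0, 10.0, days.size),
    5.0 + 0.01 * days ** 2 + rng.normal(0.0, 0.5, days.size),
])

normalized = zscore(raw)                       # 1. normalization
residuals = residual_matrix(days, normalized)  # 2-4. models, residuals, matrix
labels = kmeans(residuals, k=2)                # 4. clustering on the residuals

# 5. regression of a hypothetical, synthetic severity score on the residuals
severity = residuals @ np.array([0.5, -0.2, 0.1]) + rng.normal(0.0, 0.05, days.size)
weights = linear_regression(residuals, severity)
```

Working on residuals rather than raw values means the clustering and regression steps see how far each day's measurement deviates from the fitted trend, which is one plausible reading of why the abstract places residual calculation ahead of clustering.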

Department / School

Computer Science / Emory University

Degree / Year

BS / Spring 2018

Committee

Jinho D. Choi, Computer Science and QTM, Emory University (Chair)
Davide Fossati, Computer Science, Emory University
Mary Galinski, Medicine, Emory University
Arri Eisen, Biology, Emory University
Astrid Prinz, Biology, Emory University

Links

Anthology | Paper | Presentation