Introduction: Advances in natural language processing, speech recognition, and machine learning allow the exploration of linguistic and acoustic changes that were previously difficult to measure. We developed processes for deriving lexical-semantic and acoustic measures as digital voice biomarkers of Alzheimer’s disease (AD), and evaluated the clinical sensitivity of these measures against neurocognitive, neuroimaging, and CSF AD biomarker data.
Methods: We collected connected speech, neuropsychological test findings, neuroimaging data (brain volume, connectivity), and CSF AD biomarker data (amyloid-β-1-42, total tau, phosphorylated tau) from 92 cognitively unimpaired (40 Aβ+) and 114 cognitively impaired (63 Aβ+) participants. Acoustic and lexical-semantic features were derived from audio recordings using machine learning approaches.
Results: Lexical-semantic scores (AUC=0.80) and acoustic scores (AUC=0.77) showed higher diagnostic performance for detecting mild cognitive impairment (MCI) than the Boston Naming Test (AUC=0.66). Only lexical-semantic scores detected amyloid-β status (p=0.0003). Acoustic scores were associated with hippocampal volume (p=0.017), while lexical-semantic scores were associated with CSF amyloid-β (p=0.007). Both measures were significantly associated with 2-year disease progression and mapped to functional connectivity in AD-susceptible brain regions.
Discussion: These preliminary findings suggest that biomarkers derived from standardized audio recordings may identify persons with cognitive impairment due to preclinical or prodromal AD and may predict disease progression.
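For readers less familiar with the AUC values reported in the Results, the following is a minimal sketch of how an area under the ROC curve can be computed from a continuous score and a binary diagnosis. It is not the authors' analysis pipeline: the function name `roc_auc` and the toy scores and labels are hypothetical and chosen only for illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen positive case (label 1)
    receives a higher score than a randomly chosen negative case (label 0),
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT study data): higher score = more impaired.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]  # 1 = impaired, 0 = unimpaired

auc = roc_auc(toy_scores, toy_labels)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported values of 0.80, 0.77, and 0.66 can be read.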