2022-06-18: ADHD Prediction Through Analysis of Eye Movements With Graph Convolution Network

Processing speech in background noise requires parsing a distorted auditory signal, so individuals with attention deficit hyperactivity disorder (ADHD) may have difficulty with it due to reduced inhibitory control and working memory capacity. In a previous study (Jayawardena et al.), we compared audiovisual speech-in-noise (SIN) performance and eye-tracking measures of young adults with ADHD against age-matched controls for ADHD evaluation. The study had five ADHD participants and six non-ADHD participants. We utilized eye-tracking data recorded with a Tobii Pro X2-60 screen-based eye tracker. Each participant watched a computer screen on which a female speaker spoke sentences aloud as the level of background noise varied, and was asked to repeat each sentence exactly as they heard it. The task used six levels of background noise, ranging from 0 to 25 dB.

At each background noise level, each participant was presented with nine sentences. There were two sentence sets, each with nine sentences, different from one another; some participants listened to one set at each background noise level while others listened to the other set. In each trial, a participant completed an audiovisual SIN task with one of the six background noise levels and one of the nine sentences. There were 830 such trials in this study.

In this blog, I discuss how we utilize graphs generated from eye-tracking data for the diagnosis of ADHD. For this study, we formed six graphs utilizing the eight eye-tracking features we generated.

Graph Construction

Pre-process Eye-Tracking Data

Initially, we utilized the eye-tracking data from Jayawardena et al. and pre-processed it to generate the following gaze metrics from the raw eye movements for the entire trial: (1) fixation count, (2) average fixation duration, (3) total fixation duration, (4) standard deviation of fixation duration, (5) max saccade peak velocity, (6) min saccade amplitude, (7) average saccade amplitude, and (8) standard deviation of saccade amplitude. For the analysis of eye movements, we first extracted the (x, y) coordinates of eye movements, along with timestamps and pupil dilation, from the raw gaze data exported from the Tobii Pro X2-60 screen-based eye tracker. Then, we classified the extracted raw data into fixations and saccades. Next, we calculated the aforementioned fixation and saccade statistics for each trial. Altogether we had 830 instances of fixation and saccade statistics because we had 830 trials. A minimal sketch of this computation follows.
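
The sketch below illustrates the per-trial metric computation, assuming fixations and saccades have already been classified; the DataFrame column names here are assumptions for illustration, not the study's actual schema.

    import pandas as pd

    def trial_metrics(fixations: pd.DataFrame, saccades: pd.DataFrame) -> list:
        # fixations: one row per fixation with a 'duration' column (ms) -- assumed schema
        # saccades: one row per saccade with 'peak_velocity' (deg/s)
        #           and 'amplitude' (deg) columns -- assumed schema
        return [
            len(fixations),                    # (1) fixation count
            fixations["duration"].mean(),      # (2) average fixation duration
            fixations["duration"].sum(),       # (3) total fixation duration
            fixations["duration"].std(),       # (4) std of fixation duration
            saccades["peak_velocity"].max(),   # (5) max saccade peak velocity
            saccades["amplitude"].min(),       # (6) min saccade amplitude
            saccades["amplitude"].mean(),      # (7) average saccade amplitude
            saccades["amplitude"].std(),       # (8) std of saccade amplitude
        ]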

Graphs

We formed six graphs utilizing the eight eye-tracking features we generated and the connections among trials in terms of subject, background noise, and sentence. We created six undirected multi-graphs, each with 830 nodes, with each node corresponding to a trial.

Nodes

An example graph we generated is an undirected graph G = (N, E) where the nodes are N = {1, 2, 3, 4, ..., 830}, and the links between pairs of nodes are E = {(1, 2), (1, 3), (1, 4), ...}. Here, nodes correspond to trials. Each trial is defined by a participant, a background noise level, and the sentence presented to the participant.

Node Features

Since we consider each trial as a node of our graph, for each node we created a feature vector from the eight eye gaze metrics we calculated. The following is an example node features entry, out of 830 entries.

"k": [ 6, 220, 3868, 228, 186, 0, 53, 6]

Here, k is the trial/node ID. The values correspond to the eight eye gaze metrics: fixation count, average fixation duration, total fixation duration, standard deviation of fixation duration, max saccade peak velocity, min saccade amplitude, average saccade amplitude, and standard deviation of saccade amplitude, respectively.
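
For concreteness, a minimal sketch of assembling such entries into the full (830, 8) feature matrix; 'trial_features' and the column names are assumed names for illustration.

    import pandas as pd

    feature_names = [
        "fixation_count", "avg_fixation_duration", "total_fixation_duration",
        "std_fixation_duration", "max_saccade_peak_velocity",
        "min_saccade_amplitude", "avg_saccade_amplitude",
        "std_saccade_amplitude",
    ]
    # one entry per trial; only the example entry from above is shown
    trial_features = {"k": [6, 220, 3868, 228, 186, 0, 53, 6]}
    node_features = pd.DataFrame.from_dict(trial_features, orient="index",
                                           columns=feature_names)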

Edges

A link between a pair of nodes means that the two trials fall under the same edge category. We introduced six edge categories:

1. Same Background Noise Level: links all trials performed (sentences heard) by all subjects at the same noise level

2. Same Subject: links all trials performed by the same subject, regardless of the noise level

3. Same Sentence: links all trials performed by all subjects when presented with the same sentence, regardless of the noise level

4. Same Background Noise Level and Same Subject: links all trials performed by the same subject at the same noise level

5. Same Subject and Same Sentence: links all trials performed by the same subject with the same sentence, regardless of the noise level

6. Same Background Noise Level and Same Sentence: links all trials performed by different subjects at the same noise level when presented with the same sentence

In the Same Background Noise Level edge category, we considered a trial to be adjacent to another trial if they belong to the same background noise level, regardless of the subject who performed the trial and regardless of the sentence presented during that trial. This edge category had a file consisting of two columns: (1) trial_id_1 and (2) trial_id_2. Each row of this file indicates that those two trials are linked according to this edge category. Each edge (connection) is unweighted in our graph. Similar to the Same Background Noise Level edge category, in every other category a trial is adjacent to another trial if the pair satisfies that category's condition. A sketch of how such an edge file could be generated is shown below.
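
A hedged sketch of generating edge-category files; the 'trial_info' DataFrame and its 'noise_level'/'subject' columns are assumed names for illustration, not the study's actual files.

    import itertools
    import pandas as pd

    def same_attribute_edges(trial_info: pd.DataFrame, columns: list) -> pd.DataFrame:
        # trial_info: one row per trial with 'subject', 'noise_level',
        # 'sentence' columns (assumed schema).
        # Link every pair of trials sharing the same values in `columns`.
        pairs = []
        for _, group in trial_info.groupby(columns):
            pairs.extend(itertools.combinations(group.index, 2))
        return pd.DataFrame(pairs, columns=["trial_id_1", "trial_id_2"])

    # e.g. the "Same Background Noise Level" category:
    edges_noise = same_attribute_edges(trial_info, ["noise_level"])
    # ... and "Same Background Noise Level and Same Subject":
    edges_noise_subject = same_attribute_edges(trial_info,
                                               ["noise_level", "subject"])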

Target of each Node

We associated a binary value (0 or 1) as the target of each node/trial. If a trial is associated with a non-ADHD participant, we marked that node's target as 0. Similarly, if a trial is associated with an ADHD participant, we marked that node's target as 1.

Graph Convolutional Network (GCN) model for ADHD Detection

Graph neural network models can be utilized for graph data processing. A GCN can use both the graph structure and node feature information, capturing neighborhood information by incorporating node features into the embeddings. We applied a GCN to classify our nodes (the trials of the experiment) into ADHD or non-ADHD. For this purpose, we used the StellarGraph library and its implementation of GCN.

Our goal is to train multiple GCN models on the different types of edges we created. We read the node features file as a feature matrix of shape (830, 8), where 830 is the number of nodes and 8 is the number of features of each node. Then, we read in the targets file. In our data, about 67.60% of nodes belong to non-ADHD participants and 32.40% of nodes belong to ADHD participants.

Since StellarGraph has its own graph data structure, we transformed our data into a StellarGraph by providing the node features and the edges (one edge category at a time) to the StellarGraph constructor. For the different edge categories, we used StellarGraph to generate multiple undirected multi-graphs, each with 830 nodes and a varying number of edges, as sketched below.
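
A minimal sketch of this step, reusing the 'node_features' and 'edges_noise' DataFrames from the earlier sketches; StellarGraph expects the edge columns to be named 'source' and 'target'.

    from stellargraph import StellarGraph

    edges = edges_noise.rename(columns={"trial_id_1": "source",
                                        "trial_id_2": "target"})
    G = StellarGraph(nodes=node_features, edges=edges)
    print(G.info())  # reports an undirected multigraph with 830 nodes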

Graph Data Pre-processing 

To feed the data to the GCN, we aggregated the feature information from the neighboring nodes through adjacency matrices. An adjacency matrix denotes the connections between the nodes. By multiplying the node features with the normalized adjacency matrix, we aggregated the neighboring features.
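
For illustration, a minimal NumPy sketch of the symmetric normalization commonly used with GCNs; StellarGraph performs this step internally.

    import numpy as np

    def normalize_adjacency(A: np.ndarray) -> np.ndarray:
        # A_hat = D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I
        A_tilde = A + np.eye(A.shape[0])        # add self-loops
        d = A_tilde.sum(axis=1)                 # node degrees
        D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}
        return D_inv_sqrt @ A_tilde @ D_inv_sqrt

    # one propagation step: aggregate neighbor features
    # aggregated = normalize_adjacency(A) @ X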

GCN Model

We used the node features and adjacency matrix representations as the input of the GCN model. A GCN layer is a multiplication of its inputs, its weights, and the normalized adjacency matrix. We utilized the GCN layer from StellarGraph. We initialized the input layers with the correct shapes to receive our three inputs: (1) the feature matrix, (2) the train, validation, and test indices, and (3) the normalized adjacency matrix for each edge category. Then, we built our model with two GCN layers with dropout. Each layer has 32 units. We used the Adam optimizer (lr = 0.01) and binary cross-entropy as our loss function. We trained our model for 200 epochs.
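
The following sketch follows StellarGraph's GCN node-classification demo (StellarGraph 1.x API; the dropout rate shown is an assumption, as the post does not state it).

    from stellargraph.mapper import FullBatchNodeGenerator
    from stellargraph.layer import GCN
    from tensorflow.keras import Model, layers, losses, optimizers

    generator = FullBatchNodeGenerator(G, method="gcn")  # normalizes adjacency
    gcn = GCN(layer_sizes=[32, 32], activations=["relu", "relu"],
              generator=generator, dropout=0.5)  # dropout rate is an assumption
    x_inp, x_out = gcn.in_out_tensors()

    # binary ADHD / non-ADHD output
    predictions = layers.Dense(units=1, activation="sigmoid")(x_out)
    model = Model(inputs=x_inp, outputs=predictions)
    model.compile(optimizer=optimizers.Adam(learning_rate=0.01),
                  loss=losses.binary_crossentropy, metrics=["acc"])

    # train_indices / train_targets come from the train/validation/test split
    # train_gen = generator.flow(train_indices, train_targets)
    # model.fit(train_gen, epochs=200, validation_data=val_gen)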

Proposed Model Parameters with GCN Layers

Results

To evaluate the performance of the models corresponding to each edge category, we report the Area Under the ROC Curve (ROC AUC), average precision, and F1 score for all edge categories that we trained our model on.

Comparison of ROC AUC, average precision, and F1 score for the models trained on each edge category

From the results, we observed that only the "Same Background Noise Level and Same Subject" edge category gave slightly better results in terms of ROC AUC and precision. In terms of F1 score, the "Same Background Noise Level and Same Sentence" edge category gave slightly better results. We obtained a ROC AUC score of around 0.5 with 750 labelled trials, which indicates that our model does not learn a proper separation between ADHD and non-ADHD trials from the neighbors.

Additionally, we visualized what the model has learned by accessing the embeddings before the classification layer, to identify which model has the best separation between the two classes. From these visualizations, we observed that the embeddings from the edge categories "Same Subject and Same Sentence" and "Same Subject" have more potential for classifying ADHD and non-ADHD trials (see the sketch after the figures below).

Embeddings from edge category: Same Subject

Embeddings from edge category: Same Subject and Same Sentence
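
A sketch of the embedding extraction, reusing x_inp, x_out, and generator from the model sketch above and projecting the 32-dimensional embeddings to 2-D with t-SNE.

    from sklearn.manifold import TSNE
    from tensorflow.keras import Model

    embedding_model = Model(inputs=x_inp, outputs=x_out)  # drop the Dense layer
    all_gen = generator.flow(node_features.index)
    emb = embedding_model.predict(all_gen)  # full-batch output: (1, 830, 32)

    emb_2d = TSNE(n_components=2).fit_transform(emb.squeeze(0))
    # plot emb_2d colored by each node's ADHD / non-ADHD target
    # to inspect class separation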


Future work

In the future, we plan to use support vector machines on top of the GCN layers to predict the diagnosis of ADHD based on the two edge categories ("Same Subject and Same Sentence" and "Same Subject") which showed potential in classifying ADHD and non-ADHD trials. In Jayawardena et al., we utilized four areas of interest (AOIs) of the eye-tracking stimulus when deriving eye movement metrics: (1) AOI-1: left eye, (2) AOI-2: right eye, (3) AOI-3: nose, and (4) AOI-4: mouth. We plan to incorporate this AOI data as node features as well. We also plan to combine all six adjacency matrices we created from the six edge categories and train one GCN model, so that it could learn multiple distinct relationships among the trial nodes and predict the diagnosis of ADHD.

Citation

  1. G. Jayawardena, A. Michalek, A. Duchowski, and S. Jayarathna, "Pilot study of audiovisual speech-in-noise (SIN) performance of young adults with ADHD," in ACM Symposium on Eye Tracking Research and Applications, 2020. DOI: 10.1145/3379156.3391373
-- Gavindya Jayawardena (@Gavindya2)

