| Abstract |
Objective: To analyze the relationship between bioacoustic speech features and emotional states across neurocognitive function subtypes in adolescents. Methods: The study included 442 adolescents aged 12 to 17 (208 male and 234 female students). The Chinese version of the MATRICS Consensus Cognitive Battery was used to assess performance in six neurocognitive domains (information processing speed, verbal learning and memory, working memory, reasoning and problem solving, visual learning and memory, and attention/vigilance) and one social cognition domain (emotion management). Senior psychiatrists rated the subjects' anxiety and depression using the Hamilton Anxiety Scale and the Hamilton Depression Scale. The subjects also self-rated their depression and anxiety using the 9-item Patient Health Questionnaire (PHQ-9) and the 7-item Generalized Anxiety Disorder Scale (GAD-7). Subjects read aloud three paragraphs serving as positive, neutral and negative emotional stimuli, and their speech was recorded with a speech acquisition device. Bioacoustic features were extracted using openSMILE software: root mean square energy, 12 Mel-frequency cepstral coefficients (MFCCs), zero crossing rate, voicing probability and fundamental frequency. Hierarchical clustering was used to identify cognitive subtypes according to the subjects' cognitive function levels. To evaluate clustering stability, the Adjusted Rand Index (ARI) was calculated for solutions with 2 to 6 cognitive subtypes, and the average ARI of each solution was obtained using 10-fold cross-validation (5000 random samplings). Differences in bioacoustic speech features and emotional states among the cognitive subtypes were then tested. Results: In the hierarchical cluster analysis, the average 10-fold cross-validated ARI of the two-subtype solution (0.4838) was higher than that of the three-, four-, five- and six-subtype solutions.
In the two-subtype clustering solution, subtype 1 performed better than subtype 2 in all cognitive domains except working memory and social cognition. Compared with subtype 2, subjects of cognitive subtype 1 had a lower zero crossing rate when reading the positive text (F=4.768, P=0.03), a lower zero crossing rate when reading the neutral text (F=4.846, P=0.028), and a higher Mel-frequency cepstral coefficient 1 (MFCC-1) value (F=4.69, P=0.031). There was no significant difference in emotional state between subtypes 1 and 2. Conclusion: This study found, for the first time, that among healthy adolescents, bioacoustic speech features differed significantly between individuals with different levels of cognitive function when they read texts serving as positive and neutral emotional stimuli. MFCC-1 and the zero crossing rate may help differentiate and identify levels of cognitive function in adolescents.
|
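For readers unfamiliar with two of the features named in the abstract, the following is a minimal sketch of how a frame-wise zero crossing rate and root mean square energy can be computed. It is not the openSMILE implementation used in the study, which may apply windowing or mean removal. The function names, the 16 kHz sampling rate, the 25 ms frame length and the 10 ms hop are all assumptions made for illustration, and the input is a synthetic sine wave rather than study data.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing samples are dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def zero_crossing_rate(frames):
    """Per frame: fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)


def rms_energy(frames):
    """Per frame: square root of the mean squared amplitude."""
    return np.sqrt(np.mean(frames ** 2, axis=1))


# Synthetic one-second signals (assumed 16 kHz sampling rate, not study data).
sr = 16000
t = np.arange(sr) / sr
low_tone = np.sin(2 * np.pi * 100 * t)    # 100 Hz sine
high_tone = np.sin(2 * np.pi * 1000 * t)  # 1000 Hz sine

# Assumed framing: 25 ms frames with a 10 ms hop.
frame_len, hop = 400, 160

zcr_low = zero_crossing_rate(frame_signal(low_tone, frame_len, hop)).mean()
zcr_high = zero_crossing_rate(frame_signal(high_tone, frame_len, hop)).mean()
rms_high = rms_energy(frame_signal(high_tone, frame_len, hop)).mean()

print(f"mean ZCR, 100 Hz tone:  {zcr_low:.4f}")
print(f"mean ZCR, 1000 Hz tone: {zcr_high:.4f}")
print(f"mean RMS, 1000 Hz tone: {rms_high:.4f}")
```

A higher-frequency signal changes sign more often per unit time, so its zero crossing rate is larger. This is why the feature is often read as a coarse indicator of how much high-frequency or noise-like content a speech frame contains.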