Classification Models for Predicting Mouse Spinal Motoneuron Physiological Type based on their Electrical Properties

Reuben Mawuena Kwadzo Ahorklo, University of Rhode Island

Abstract

Accurate prediction of mouse spinal motoneuron physiological types from electrical properties is hampered by missing data and imbalanced class distributions. Technical difficulties, physiological variation, and experimental issues leave gaps in electrophysiological recordings, and class imbalance arises because certain motoneuron types are rare. Both problems risk producing biased or unreliable classification models, which limits their usefulness in motor control studies. We nevertheless claim that the physiological type of a mouse spinal motoneuron can be accurately predicted from specific, measurable electrical properties.

This study focuses on two classification models, a multinomial logistic model (MLM) and a random forest (RF) model, to predict motoneuron physiological types from electrical properties, since the vital role of motoneurons in signal transmission relies on diverse electrical properties. Both model types apply to problems with more than two classes; MLM excels at identifying subtle patterns, while RF handles complex relationships within the data. We first investigate the impact of the threshold choices on the class distribution and on class-specific and overall prediction accuracy. This analysis shows that model accuracy depends on the thresholds set. Next, we incorporate oversampling and hot-deck imputation to compensate for class imbalance and missing data. Although contingent on selecting appropriate thresholds, the results illustrate that imputation and oversampling offer notable benefits by preserving data size, thereby enhancing the accuracy and stability of the classification models. Specifically, with the contraction time threshold set to 20mNs and the twitch amplitude threshold set to 8mNs, imputing missing data and resampling to address class imbalance raises the overall accuracy of the multinomial logistic model to 0.78, with class-specific accuracy ranging between 0.64 and 0.88, which contributes to the robustness of motor unit classification based on electrical properties. The emphasis on managing missing data, addressing imbalanced class distributions, and understanding the predictor-response relationship (Y) guided the preference for MLM. The decision to exclude other models was based on data characteristics, the small sample size, and specific project goals. This research advances our understanding of motor control and suggests potential clinical applications in diagnosing and treating motor neuron diseases such as amyotrophic lateral sclerosis (ALS).
In ALS, determining physiological types relies on contractile properties rather than traditional measures like computed twitch contraction time or amplitude.
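
The preprocessing steps named above (threshold-based class labels, hot-deck imputation, and oversampling) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the three class names, the labeling rule, the feature names (input_resistance, rheobase), and all data values are hypothetical, and the cut values 20 and 8 simply mirror the numbers quoted in the abstract without asserting their units. The hot-deck variant shown draws donors at random within the same class, which is only one of several possible donor rules.

```python
import random
from collections import Counter


def label_type(contraction_time, twitch_amplitude, ct_cut, amp_cut):
    """Hypothetical three-class rule built from two threshold cuts."""
    if contraction_time >= ct_cut:
        return "slow"
    return "fast_large" if twitch_amplitude >= amp_cut else "fast_small"


def hot_deck_impute(rows, rng):
    """Fill each missing value (None) with an observed value drawn at
    random from a donor row of the same class (one hot-deck variant)."""
    filled = [dict(r) for r in rows]  # copy so the input stays untouched
    features = [k for k in rows[0] if k != "type"]
    for feat in features:
        donors = {}
        for r in rows:
            if r[feat] is not None:
                donors.setdefault(r["type"], []).append(r[feat])
        for r in filled:
            if r[feat] is None and donors.get(r["type"]):
                r[feat] = rng.choice(donors[r["type"]])
    return filled


def random_oversample(rows, rng):
    """Duplicate randomly chosen rows of each smaller class until every
    class is as large as the largest one."""
    by_class = {}
    for r in rows:
        by_class.setdefault(r["type"], []).append(r)
    target = max(len(members) for members in by_class.values())
    out = []
    for members in by_class.values():
        out.extend(members)
        out.extend(rng.choice(members) for _ in range(target - len(members)))
    return out


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Made-up recordings: (contraction time, twitch amplitude,
# input_resistance, rheobase); None marks a missing measurement.
raw = [
    (25.0, 3.0, 4.1, None),
    (28.0, 2.0, 3.8, 2.0),
    (12.0, 5.0, 2.5, 6.0),
    (14.0, 4.0, None, 5.5),
    (15.0, 6.0, 2.2, 6.5),
    (10.0, 12.0, 1.1, 11.0),
]
rows = [
    {"type": label_type(ct, amp, 20.0, 8.0),
     "input_resistance": ir, "rheobase": rh}
    for ct, amp, ir, rh in raw
]

imputed = hot_deck_impute(rows, rng)
balanced = random_oversample(imputed, rng)
print(Counter(r["type"] for r in balanced))
```

Changing the two cut values changes the class counts before any model is fitted, which is one way to see why accuracy can depend on the thresholds. The balanced rows could then be passed to a multinomial logistic or random forest implementation, which is not shown here. In practice, oversampling is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.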