Edge-Optimized Facial Emotion Recognition: A High-Performance Hybrid MobileNetV2-ViT Model

Authors

  • Susmith Barigidad, Lam Research, USA

DOI:

https://doi.org/10.63282/3050-9416.IJAIBDCMS-V6I2P101

Keywords:

Computer vision application, MobileNetV2, Facial Emotion Recognition, Vision Transformer, Haar Cascade algorithm

Abstract

Computer vision applications span fields including healthcare, security, autonomous vehicles, and augmented reality, enabling machines to interpret and analyze visual data. Facial Emotion Recognition (FER), a prominent computer vision application in healthcare, analyzes and interprets human emotions from facial expressions. FER also plays a vital role in human-computer interaction, with further applications in security and affective computing. This study proposes a deep learning (DL) based hybrid model that integrates MobileNetV2 for efficient feature extraction with a Vision Transformer (ViT) for capturing global facial dependencies. A dataset obtained from Kaggle is preprocessed, augmented, and used for training. The trained model is deployed on a smartphone as an edge device, enabling real-time emotion recognition with improved privacy, low latency, and minimal computational overhead. During testing, facial images captured by the smartphone are preprocessed using the Haar Cascade algorithm before being fed into the model for classification. Performance evaluation using accuracy, recall, precision, and F1-score demonstrates a high classification accuracy of 98.51%, confirming the model's effectiveness. The proposed approach enhances on-device FER capabilities, making it a promising solution for emotion-aware applications in mobile healthcare and intelligent human-computer interaction.
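The core idea of the hybrid architecture described above can be sketched in plain NumPy: a CNN backbone such as MobileNetV2 produces a spatial feature map, which is reshaped into a sequence of tokens so that a transformer-style self-attention stage can relate every facial region to every other one. The shapes, projection dimension, and seven-class output below are illustrative assumptions, not the paper's exact configuration, and random weights stand in for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes: MobileNetV2's final feature map for a 224x224 input
# is 7x7 with 1280 channels; 7 emotion classes are assumed here.
H, W, C, D, NUM_CLASSES = 7, 7, 1280, 64, 7

feature_map = rng.standard_normal((H, W, C))   # stand-in for the CNN output
tokens = feature_map.reshape(H * W, C)          # 49 spatial tokens

# Random projections stand in for learned query/key/value weights.
Wq, Wk, Wv = (rng.standard_normal((C, D)) * 0.01 for _ in range(3))
Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv

# Scaled dot-product self-attention: each token attends to all 49
# positions, which is how the ViT stage models global facial context
# that a purely local convolutional backbone would miss.
scores = Q @ K.T / np.sqrt(D)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
attended = weights @ V                          # (49, D)

# Mean-pool the tokens and project to emotion-class probabilities.
W_head = rng.standard_normal((D, NUM_CLASSES)) * 0.01
logits = attended.mean(axis=0) @ W_head
probs = np.exp(logits) / np.exp(logits).sum()
print(attended.shape, probs.shape)
```

In a trained system the random matrices would be learned end-to-end, and the Haar Cascade detector would crop and normalize the face region before the CNN stage; this sketch only shows how the CNN-to-transformer hand-off works.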

References

[1] Rezaee, K., Rezakhani, S. M., Khosravi, M. R., & Moghimi, M. K. (2024). A survey on deep learning-based real-time crowd anomaly detection for secure distributed video surveillance. Personal and Ubiquitous Computing, 28(1), 135-151.

[2] Zadeh, E. K., & Alaeifard, M. (2023). Adaptive Virtual Assistant Interaction through Real-Time Speech Emotion Analysis Using Hybrid Deep Learning Models and Contextual Awareness. International Journal of Advanced Human Computer Interaction, 1(1), 1-15.

[3] Wu, H., Li, X., & Deng, Y. (2020). Deep learning-driven wireless communication for edge-cloud computing: opportunities and challenges. Journal of Cloud Computing, 9(1), 21.

[4] Zhang, H., Jolfaei, A., & Alazab, M. (2019). A face emotion recognition method using convolutional neural network and image edge computing. IEEE Access, 7, 159081-159089.

[5] Yang, J., Qian, T., Zhang, F., & Khan, S. U. (2021). Real-time facial expression recognition based on edge computing. IEEE Access, 9, 76178-76190.

[6] Chen, A., Xing, H., & Wang, F. (2020). A facial expression recognition method using deep convolutional neural networks based on edge computing. IEEE Access, 8, 49741-49751.

[7] Hossain, M. S., & Muhammad, G. (2019). Emotion recognition using secure edge and cloud computing. Information Sciences, 504, 589-601.

[8] Wu, Y., Zhang, L., Gu, Z., Lu, H., & Wan, S. (2023). Edge-AI-driven framework with efficient mobile network design for facial expression recognition. ACM Transactions on Embedded Computing Systems, 22(3), 1-17.

[9] Pascual, A. M., Valverde, E. C., Kim, J. I., Jeong, J. W., Jung, Y., Kim, S. H., & Lim, W. (2022). Light-FER: a lightweight facial emotion recognition system on edge devices. Sensors, 22(23), 9524.

[10] Makhmudkhujaev, F., Abdullah-Al-Wadud, M., Iqbal, M. T. B., Ryu, B., & Chae, O. (2019). Facial expression recognition with local prominent directional pattern. Signal Processing: Image Communication, 74, 1-12.

[11] Ajay, B. S., & Rao, M. (2021, February). Binary neural network based real time emotion detection on an edge computing device to detect passenger anomaly. In 2021 34th International Conference on VLSI Design and 2021 20th International Conference on Embedded Systems (VLSID) (pp. 175-180). IEEE.

[12] Xu, G., Yin, H., & Yang, J. (2020, December). Facial expression recognition based on convolutional neural networks and edge computing. In 2020 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS) (pp. 226-232). IEEE.

[13] Pathak, R., & Singh, Y. (2020, October). Real time baby facial expression recognition using deep learning and IoT edge computing. In 2020 5th International conference on computing, communication and security (ICCCS) (pp. 1-6). IEEE.

[14] Wang, S., Qu, J., Zhang, Y., & Zhang, Y. (2023). Multimodal emotion recognition from EEG signals and facial expressions. IEEE Access, 11, 33061-33068.

[15] Chaudhari, A., Bhatt, C., Krishna, A., & Mazzeo, P. L. (2022). ViTFER: facial emotion recognition with vision transformers. Applied System Innovation, 5(4), 80.

[16] Umer, S., Rout, R. K., Pero, C., & Nappi, M. (2022). Facial expression recognition with trade-offs between data augmentation and deep learning features. Journal of Ambient Intelligence and Humanized Computing, 1-15.

[17] CK Dataset, Kaggle. Available: https://www.kaggle.com/datasets/shareef0612/ckdataset

[18] Dong, K., Zhou, C., Ruan, Y., & Li, Y. (2020, December). MobileNetV2 model for image classification. In 2020 2nd International Conference on Information Technology and Computer Application (ITCA) (pp. 476-480). IEEE.

[19] Li, J., Yan, Y., Liao, S., Yang, X., & Shao, L. (2021). Local-to-global self-attention in vision transformers. arXiv preprint arXiv:2107.04735.

Published

2025-04-03

Section

Articles

How to Cite

Barigidad S. Edge-Optimized Facial Emotion Recognition: A High-Performance Hybrid MobileNetV2-ViT Model. IJAIBDCMS [Internet]. 2025 Apr. 3 [cited 2025 Sep. 14];6(2):1-10. Available from: https://ijaibdcms.org/index.php/ijaibdcms/article/view/113