Tracked Robot Control with Hand Gesture Based on MediaPipe

Authors

  • Marthed Wameed, Department of Mechatronics Engineering / Al-Khwarizmi College of Engineering / University of Baghdad / Baghdad / Iraq
  • Ahmed M. ALKAMACHI, Department of Mechatronics Engineering / Al-Khwarizmi College of Engineering / University of Baghdad / Baghdad / Iraq
  • Ergun Erçelebi, Department of Electrical and Electronics Engineering / Gaziantep University / Turkey

DOI:

https://doi.org/10.22153/kej.2023.04.004

Abstract

Hand gestures are currently considered one of the most accurate means of communication in many applications, such as sign language, robot control, virtual worlds, smart homes, and video games. Several techniques are used to detect and classify hand gestures, for instance gloves fitted with multiple sensors, or computer vision. In this work, computer vision is used instead of gloves to control the robot's movement, because gloves require complicated electrical connections that limit user mobility, their sensors can be costly to replace, and they can spread skin diseases between users. The vision-based approach adopted here is MediaPipe (MP), a recent framework developed by Google. MP detects the hand and identifies 21 three-dimensional landmark points on it, and gestures are classified by comparing the relative positions of those points. After a gesture is detected and classified, the system controls the tracked robot in real time, with each hand gesture mapped to a specific movement of the robot. This work concludes that the MP method is more accurate and faster in response than a Deep Learning (DL) approach, specifically a Convolutional Neural Network (CNN). The experimental results show that the real-time accuracy of the method decreases in some cases when environmental factors change, namely light intensity, distance, and the tilt angle between the hand gesture and the camera. The reason is that, in some cases, the fingers are held close together, some fingers are neither fully closed nor fully opened, and the quality of the camera used degrades as environmental conditions change. The system is then unable to determine whether a finger is closed or opened, so the algorithm misclassifies the gesture (the classification accuracy decreases) and the response time of the tracked robot's movement increases.
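The gesture-classification idea described above can be illustrated with a short sketch (not the authors' code): MediaPipe Hands returns 21 landmarks per detected hand, and a simple rule compares each fingertip's position with the joint below it to decide whether the finger is raised; the resulting finger count is then mapped to a robot motion command. The webcam index, the finger-count rule, and the COMMANDS mapping are illustrative assumptions, not the paper's exact classifier or robot protocol.

```python
# Minimal sketch: classify a static hand gesture from MediaPipe Hands landmarks
# and map it to a tracked-robot command (hypothetical command set).
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

# Landmark indices of fingertips and the PIP joints below them (MediaPipe hand model).
FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky
FINGER_PIPS = [6, 10, 14, 18]

# Hypothetical mapping from "number of raised fingers" to a robot motion command.
COMMANDS = {0: "stop", 1: "forward", 2: "backward", 3: "left", 4: "right"}


def count_raised_fingers(landmarks):
    """A finger counts as raised when its tip lies above its PIP joint
    (smaller y in normalized image coordinates)."""
    raised = 0
    for tip, pip in zip(FINGER_TIPS, FINGER_PIPS):
        if landmarks[tip].y < landmarks[pip].y:
            raised += 1
    return raised


def main():
    cap = cv2.VideoCapture(0)  # assumes a webcam at index 0
    with mp_hands.Hands(max_num_hands=1,
                        min_detection_confidence=0.7,
                        min_tracking_confidence=0.5) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV captures BGR frames.
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            command = "stop"
            if results.multi_hand_landmarks:
                landmarks = results.multi_hand_landmarks[0].landmark
                command = COMMANDS.get(count_raised_fingers(landmarks), "stop")
            # Here the command would be forwarded to the tracked robot
            # (e.g. over a serial or Wi-Fi link); printing stands in for that step.
            print(command)
            cv2.imshow("gesture control", frame)
            if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
                break
    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```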


References

S. Amaliya, A. N. Handayani, M. I. Akbar, H. W. Herwanto, O. Fukuda, and W. C. Kurniawan, “Study on Hand Keypoint Framework for Sign Language Recognition,” 2021 7th Int. Conf. on Electrical, Electronics and Information Engineering (ICEEIE), pp. 3–8, 2021, doi: 10.1109/ICEEIE52663.2021.9616851.

F. Hardan and A. R. J. Almusawi, “Developing an Automated Vision System for Maintaining Social Distancing to Cure the Pandemic,” Al-Khwarizmi Eng. J., vol. 18, no. 1, pp. 38–50, 2022, doi: 10.22153/kej.2022.03.002.

Y. G. Khidhir and A. H. Morad, “Comparative Transfer Learning Models for End-to-End Self-Driving Car,” Al-Khwarizmi Eng. J., vol. 18, no. 4, pp. 45–59, 2022, doi: 10.22153/kej.2022.09.003.

A. Osipov and M. Ostanin, “Real-time static custom gestures recognition based on skeleton hand,” 2021 Int. Conf. Nonlinearity, Information and Robotics (NIR), pp. 1–4, 2021, doi: 10.1109/NIR52917.2021.9665809.

A. Mujahid et al., “Real-time hand gesture recognition based on deep learning YOLOv3 model,” Appl. Sci., vol. 11, no. 9, 2021, doi: 10.3390/app11094164.

M. Al-Hammadi et al., “Deep learning-based approach for sign language gesture recognition with efficient hand gesture representation,” IEEE Access, vol. 8, pp. 192527–192542, 2020, doi: 10.1109/ACCESS.2020.3032140.

H. Y. Chung, Y. L. Chung, and W. F. Tsai, “An efficient hand gesture recognition system based on deep CNN,” Proc. IEEE Int. Conf. Ind. Technol., vol. 2019-February, pp. 853–858, 2019, doi: 10.1109/ICIT.2019.8755038.

Y. S. Tan, K. M. Lim, and C. P. Lee, “Hand gesture recognition via enhanced densely connected convolutional neural network,” Expert Syst. Appl., vol. 175, p. 114797, 2021, doi: 10.1016/j.eswa.2021.114797.

R. Ahuja, D. Jain, D. Sachdeva, A. Garg, and C. Rajput, “Convolutional neural network based American sign language static hand gesture recognition,” Int. J. Ambient Comput. Intell., vol. 10, no. 3, pp. 60–73, 2019, doi: 10.4018/IJACI.2019070104.

P. Nakjai and T. Katanyukul, “Hand Sign Recognition for Thai Finger Spelling: an Application of Convolution Neural Network,” J. Signal Process. Syst., vol. 91, no. 2, pp. 131–146, 2019, doi: 10.1007/s11265-018-1375-6.

A. G. Mahmoud, A. M. Hasan, and N. M. Hassan, “Convolutional neural networks framework for human hand gesture recognition,” Bull. Electr. Eng. Informatics, vol. 10, no. 4, pp. 2223–2230, 2021, doi: 10.11591/EEI.V10I4.2926.

T. B. Waskito, S. Sumaryo, and C. Setianingsih, “Wheeled Robot Control with Hand Gesture based on Image Processing,” 2020 IEEE Int. Conf. on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT), pp. 48–54, 2020, doi: 10.1109/IAICT50021.2020.9172032.

P. N. Huu, Q. T. Minh, and H. L. The, “An ANN-based gesture recognition algorithm for smart-home applications,” KSII Trans. Internet Inf. Syst., vol. 14, no. 5, pp. 1967–1983, 2020, doi: 10.3837/tiis.2020.05.006.

B. J. Boruah, A. K. Talukdar, and K. K. Sarma, “Development of a Learning-aid tool using Hand Gesture Based Human Computer Interaction System,” 2021 Adv. Commun. Technol. Signal Process. ACTS 2021, pp. 2–6, 2021, doi: 10.1109/ACTS53447.2021.9708354.

S. Adhikary, A. K. Talukdar, and K. K. Sarma, “A Vision-based System for Recognition of Words used in Indian Sign Language Using MediaPipe,” Proc. IEEE Int. Conf. Image Inf. Process. (ICIIP), pp. 390–394, 2021, doi: 10.1109/ICIIP53038.2021.9702551.

A. Halder and A. Tayade, “Real-time Vernacular Sign Language Recognition using MediaPipe and Machine Learning,” Int. J. Res. Publ. Rev., no. 2, pp. 9–17, 2021, [Online]. Available: www.ijrpr.com.

R. Meena Prakash, T. Deepa, T. Gunasundari, and N. Kasthuri, “Gesture recognition and finger tip detection for human computer interaction,” 2017 Int. Conf. on Innovations in Information, Embedded and Communication Systems (ICIIECS), pp. 1–4, 2018, doi: 10.1109/ICIIECS.2017.8276056.

Indriani, M. Harris, and A. S. Agoes, “Applying Hand Gesture Recognition for User Guide Application Using MediaPipe,” Proc. 2nd Int. Semin. Sci. Appl. Technol. (ISSAT 2021), vol. 207, pp. 101–108, 2021, doi: 10.2991/aer.k.211106.017.

D. A. Taban, A. Al-Zuky, S. H. Kafi, A. H. Al-Saleh, and H. J. Mohamad, “Smart Electronic Switching (ON/OFF) System Based on Real-time Detection of Hand Location in the Video Frames,” J. Phys. Conf. Ser., vol. 1963, no. 1, 2021, doi: 10.1088/1742-6596/1963/1/012002.

GitHub, “MediaPipe on GitHub.” https://google.github.io/mediapipe/solutions/hands (accessed Sep. 20, 2022).

V. Bazarevsky and F. Zhang, “On-Device, Real-Time Hand Tracking with MediaPipe,” Google AI Blog, Aug. 19, 2019. https://ai.googleblog.com/2019/08/on-device-real-time-hand-tracking-with.html (accessed Oct. 01, 2022).

F. Zhang et al., “MediaPipe Hands: On-device Real-time Hand Tracking,” 2020, [Online]. Available: http://arxiv.org/abs/2006.10214.

C. Lugaresi et al., “MediaPipe: A Framework for Perceiving and Processing Reality,” Google Res., pp. 1–4, 2019.

Published

2023-09-01

How to Cite

Tracked Robot Control with Hand Gesture Based on MediaPipe. (2023). Al-Khwarizmi Engineering Journal, 19(3), 56-71. https://doi.org/10.22153/kej.2023.04.004
