Fine-tuned RetinaNet models for Vision-based Human Presence Detection

Authors

  • Jin Cheng Tang, Faculty of Manufacturing and Mechatronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pahang, Malaysia.
  • Ahmad Fakhri Bin Ab. Nasir, Faculty of Computing, Universiti Malaysia Pahang, 26600 Pekan Pahang, Malaysia.
  • Anwar P. P. Abdul Majeed, School of Robotics, XJTLU Entrepreneur College (Taicang), Xi’an Jiaotong-Liverpool University, Suzhou, 215123, P. R. China.
  • Li Lim Thai, TT Vision Holdings Berhad, 11900 Plot 106, Sungai Hilir Keluang 5, Bayan Lepas, FIZ.4, Bayan Lepas, Pulau Pinang, Malaysia.
  • Mohd Azraai Mohd Razman, Faculty of Manufacturing and Mechatronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pahang, Malaysia.
  • Ismail Mohd Khairuddin, Faculty of Manufacturing and Mechatronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pahang, Malaysia.

DOI:

https://doi.org/10.15282/mekatronika.v4i2.8850

Keywords:

Human Detection, Deep Learning, Transfer Learning, Fine-tuning, RetinaNet

Abstract

As manufacturing moves towards Industry 4.0, human-robot interaction (HRI) and human-robot collaboration (HRC) have gained popularity. Introducing more robots into industry raises safety risks, because robots are not as flexible as humans. Although robots can replace human workers in some dangerous tasks, human safety remains the top priority for all industries. The most common safeguard has been to isolate the working spaces of human workers and robots. Realizing the idea of Industry 4.0, however, calls for bringing robots and cobots out of the cage to maximize productivity and efficiency. Hence, studies have attempted to free robots from isolated working spaces while preserving the safety of human operators. The present study explores the feasibility of applying fine-tuning, a transfer learning strategy, to human presence detection as a basis for safe HRI. A custom dataset of 1463 images was collected and split into training, validation, and test sets at a ratio of 70:20:10. Three RetinaNet object detection models with different backbone networks were fine-tuned on the acquired dataset to transfer the knowledge learned in the source domain to the target domain, namely human presence detection. The results show that RetinaNet_ResNet152-V1-FPN achieved the highest test AP, 74.4%, at an inference speed of 13.09 FPS, suggesting that it is the best of the fine-tuned RetinaNet models. This study demonstrates the feasibility of fine-tuning as a strategy for training object detection models, which could serve as a basis for improving HRI applications through deep-learning-based vision. In summary, the research highlights the use of deep learning models for human presence detection, which can be further extended to HRI safety applications.
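
The 70:20:10 split described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the split was performed, so the function name `split_dataset`, the fixed random seed, the file names, and the rounding rule (round down for training and validation, remainder to test) are all assumptions made for illustration only.

```python
import random


def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle items and split them into train/validation/test subsets.

    Hypothetical helper: the rounding rule (floor for train and
    validation, remainder to test) is an assumption, not the paper's.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n = len(shuffled)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left over
    return train, val, test


# Placeholder file names standing in for the 1463 collected images.
image_names = [f"img_{i:04d}.jpg" for i in range(1463)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))
```

Under this rounding rule, 1463 images give 1024 training, 292 validation, and 147 test images. The exact per-set counts used in the study are not stated in the abstract and may differ.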

Published

2022-11-20

How to Cite

[1]
J. C. Tang, A. F. B. Ab. Nasir, A. P. P. Abdul Majeed, L. L. Thai, M. A. Mohd Razman, and I. Mohd Khairuddin, “Fine-tuned RetinaNet models for Vision-based Human Presence Detection”, MEKATRONIKA, vol. 4, no. 2, pp. 16–23, Nov. 2022.

Section

Original Article
