Multi-task Visual Perception Method in Dragon Orchards Based on OrchardYOLOP
Abstract
Faced with complex terrain, fluctuating lighting, and unstructured environments, modern orchard robots must process a large amount of environmental information efficiently. Traditional approaches that run several single-task algorithms in sequence are limited by the available computing power and cannot meet these demands. To address the real-time and accuracy requirements of multi-task autonomous driving robots in dragon fruit orchards, OrchardYOLOP was developed on the basis of YOLOP: a focus attention convolution module was introduced, C2F and SPPF modules were employed, and the loss function for the segmentation tasks was optimized. Experiments showed that OrchardYOLOP achieved a precision of 84.1% in object detection, an mIoU of 89.7% in drivable area segmentation, and an mIoU of 90.8% in fruit tree region segmentation, with an inference speed of 33.33 frames per second and only 9.67 × 10^6 parameters. Compared with YOLOP, it met the real-time requirement in speed while significantly improving accuracy. The method addresses key issues in multi-task visual perception in dragon fruit orchards and provides an effective solution for multi-task autonomous driving visual perception in unstructured environments.
Keywords: dragon orchard; multi-task; visual perception; semantic segmentation; object detection; YOLOP
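The two segmentation results in the abstract are reported as mIoU (mean intersection over union). As a reading aid, the sketch below shows the usual definition of that metric on a toy pair of label maps. It is an illustration only, not the authors' evaluation code: the function name `mean_iou`, the class labels, and the choice to average over every class present (background included) are assumptions made here.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two equally sized 2D label maps.

    pred and target are lists of rows holding integer class ids.
    Classes that appear in neither map are skipped (an assumption of
    this sketch; evaluation protocols differ on this point).
    """
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred, in_target = p == c, t == c
                inter += in_pred and in_target   # pixel in both masks
                union += in_pred or in_target    # pixel in either mask
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x4 masks with hypothetical labels: 0 = background, 1 = drivable area.
pred = [[0, 0, 1, 1],
        [0, 1, 1, 1]]
target = [[0, 0, 1, 1],
          [0, 0, 1, 1]]

score = mean_iou(pred, target, num_classes=2)
print(f"mIoU = {score:.3f}")
```

Here the background class has IoU 3/4 and the drivable class 4/5, so one mislabeled pixel out of eight already lowers the mean noticeably; this is why mIoU is a stricter measure than pixel accuracy for region segmentation.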