Segmentation and 3D reconstruction of root systems and Humulus lupulus for high-throughput phenotyping using deep neural networks and 3D imaging
Abstract
The availability of artificial intelligence (AI) tools for computer vision has increased greatly in recent years, and many advanced tools are now available to researchers across fields. Precision agriculture has likewise seen growing use of computer vision, including for crop phenotyping. AI-empowered phenotyping can significantly reduce labor burdens and risks by providing high-throughput tools that process large numbers of plants efficiently. This paper explores two applications of computer vision to challenges in horticultural phenotyping in three-dimensional (3D) space. The first work addresses the segmentation and reconstruction of root systems from X-ray computed tomography (CT) images. Accurate phenotyping of root system architecture (RSA) is a significant challenge because the root system lies below the soil surface, hidden from view, so traditional phenotyping techniques must remove it from the soil. Segmentation and reconstruction of the root system from X-ray CT images is therefore integral to observing the root system undisturbed in the soil and over time in response to abiotic and biotic stress. Numerous image-processing techniques have been applied to segmenting below-soil root systems from CT images. However, these methods often require user intervention and input parameters tuned to plant species and soil conditions. A recent deep learning approach employed a volumetric encoder-decoder to achieve high scores on common computer vision accuracy metrics. However, training a volumetric model relies on copious amounts of hand-annotated training data. We propose using a two-dimensional (2D) Mask R-CNN model for instance segmentation of root cross sections in CT images. The 2D predictions can then be merged to produce a 3D prediction.
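Merging 2D predictions into a 3D prediction can be pictured as stacking per-slice masks along the scan axis. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each CT slice yields a list of binary instance masks of identical shape, and the function names `union_instances`, `merge_slices`, and `voxel_coordinates` are hypothetical.

```python
import numpy as np


def union_instances(instance_masks, shape):
    """Collapse the instance masks predicted for one slice into one binary mask."""
    combined = np.zeros(shape, dtype=bool)
    for mask in instance_masks:
        combined |= np.asarray(mask, dtype=bool)
    return combined


def merge_slices(per_slice_instances, shape):
    """Stack per-slice binary masks into a (depth, height, width) volume.

    per_slice_instances: one list of 2D instance masks per CT slice,
    ordered along the scan axis (assumed layout, for illustration only).
    """
    slices = [union_instances(masks, shape) for masks in per_slice_instances]
    return np.stack(slices, axis=0)


def voxel_coordinates(volume):
    """Return the (z, y, x) indices of foreground voxels as an (N, 3) array."""
    return np.argwhere(volume)
```

The coordinate array returned by `voxel_coordinates` is the kind of point set a density-based clustering step could take as input, since it restores the 3D neighborhood information that a slice-wise model does not see.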
We also propose an automated parameter-tuning pipeline for density-based spatial clustering of applications with noise (DBSCAN) to remove noise from the 3D segmentation. The proposed method was evaluated on scans of poinsettias and onions and achieved average scores of 0.734, 0.868, 0.749, and 0.669 for precision, recall, Dice, and intersection over union (IoU), respectively, using only 1% of the training dataset. The method is resource efficient: it capitalizes on the training efficiency of a 2D model while still using 3D information during unsupervised clustering with DBSCAN. Our second work addresses the use of stereo vision for 3D reconstruction of Humulus lupulus (hops), together with a semi-automated pipeline for extracting the phenotypic traits of vine length, leaf area, and biomass. High-throughput computer vision tools for phenotyping are important for variety trials of hops because they allow measurements to be taken quickly and safely and allow leaf area and biomass to be measured non-destructively. However, studies developing computer vision and machine learning for morphological phenotyping are uncommon for vine plants, and even fewer address hops. This work therefore develops and evaluates a method for hop morphological phenotyping. A 2D transformer, SegFormer, was trained for semantic segmentation of hops and used to segment hop plants from 3D point cloud scenes captured with a ZED 2 stereo camera. Measurements of vine length, leaf area, and biomass derived from the segmented point clouds yielded R² values of 0.79, 0.95, and 0.91, respectively, indicating a strong correlation between the derived measurements and the ground-truth measurements.
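The abstract does not state which definition of R2 was used for the trait measurements. The sketch below shows the two common ones: the coefficient of determination computed directly from derived and ground-truth values, and the squared Pearson correlation. The function names are hypothetical, and neither is claimed to be the authors' calculation.

```python
import numpy as np


def r_squared(ground_truth, derived):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ground_truth = np.asarray(ground_truth, dtype=float)
    derived = np.asarray(derived, dtype=float)
    ss_res = np.sum((ground_truth - derived) ** 2)
    ss_tot = np.sum((ground_truth - ground_truth.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def pearson_r_squared(ground_truth, derived):
    """Squared Pearson correlation between the two measurement series."""
    r = np.corrcoef(ground_truth, derived)[0, 1]
    return r ** 2
```

The two definitions agree when the derived values are a least-squares linear fit of the ground truth, but they can differ otherwise. A derived measurement that is perfectly correlated with the ground truth yet systematically scaled keeps a squared Pearson correlation of 1 while the coefficient of determination drops, and can even become negative.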