Hierarchical deep learning framework for automated marine vegetation and fauna analysis using ROV video data
The integration of deep learning with Remotely Operated Vehicles (ROVs) has advanced scalable, detailed monitoring of marine biodiversity. This study presents the Esefjorden Marine Vegetation Segmentation Dataset (EMVSD) and FjordVision, a framework for automated analysis of marine vegetation and fauna in natural habitats. FjordVision combines state-of-the-art object detection, iterative dataset refinement, and taxonomy-aware hierarchical reclassification to improve accuracy across four taxonomic levels: binary, class, genus, and species. YOLOv8 was initially employed for instance segmentation, but results showed Mask R-CNN to be more effective across the hierarchical levels. FjordVision’s hierarchical classification supports marine biodiversity assessments and offers insights for conservation applications in fjord ecosystems.
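To make the four-level hierarchy concrete, the following minimal sketch shows one way species-level scores could be aggregated into genus, class, and binary predictions. It is an illustration under stated assumptions, not FjordVision's actual reclassification method: the `TAXONOMY` table, the reading of the binary level as vegetation versus fauna, and the helper names `roll_up` and `predict` are all hypothetical, and the three species are chosen only as examples rather than taken from EMVSD.

```python
from collections import defaultdict

# Hypothetical taxonomy for illustration only; not taken from EMVSD.
# Assumption: the "binary" level separates vegetation from fauna.
# species -> (binary, class, genus)
TAXONOMY = {
    "Saccharina latissima": ("vegetation", "Phaeophyceae", "Saccharina"),
    "Fucus vesiculosus": ("vegetation", "Phaeophyceae", "Fucus"),
    "Asterias rubens": ("fauna", "Asteroidea", "Asterias"),
}
LEVELS = ("binary", "class", "genus", "species")


def roll_up(species_probs):
    """Sum species-level probabilities into every taxonomic level."""
    totals = {level: defaultdict(float) for level in LEVELS}
    for species, p in species_probs.items():
        binary, cls, genus = TAXONOMY[species]
        for level, label in zip(LEVELS, (binary, cls, genus, species)):
            totals[level][label] += p
    return {level: dict(scores) for level, scores in totals.items()}


def predict(species_probs):
    """Return the highest-scoring label at each level."""
    rolled = roll_up(species_probs)
    return {level: max(scores, key=scores.get) for level, scores in rolled.items()}


if __name__ == "__main__":
    # Made-up scores for a single detected instance.
    probs = {
        "Saccharina latissima": 0.35,
        "Fucus vesiculosus": 0.25,
        "Asterias rubens": 0.40,
    }
    print(predict(probs))
```

With these made-up scores, the two brown algae together outweigh the starfish at the binary and class levels even though the starfish has the single highest species score. Summing child scores is only one possible aggregation rule, used here as a stand-in.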